Abstract: Since the electricity bill of a data center constitutes
a significant portion of its overall operational costs, reducing
this has become important. We investigate cost reduction opportunities
that arise by the use of uninterrupted power supply
(UPS) units as energy storage devices. This represents a deviation 
from the usual use of these devices as mere transitional fail-over 
mechanisms between utility and captive sources such as diesel 
generators. We consider the problem of opportunistically using 
these devices to reduce the time average electric utility bill in a 
data center. Using the technique of Lyapunov optimization, we 
develop an online control algorithm that can optimally exploit 
these devices to minimize the time average cost. This algorithm 
operates without any knowledge of the statistics of the workload 
or electricity cost processes, making it attractive in the presence 
of workload and pricing uncertainties. An interesting feature of 
our algorithm is that its deviation from optimality reduces as the 
storage capacity is increased. Our work opens up a new area in 
data center power management. 

I. Introduction 

Data centers spend a significant portion of their overall
operational costs on their electricity bills. As an example,
one recent case study suggests that a large 15 MW
data center (on the more energy-efficient end) might spend
about $1M on its monthly electricity bill. In general, a data
center spends 30-50% of its operational expenses
on power. A large body of research addresses these
expenses by reducing the energy consumption of these data
centers. This includes designing/employing hardware with
better power/performance trade-offs [1]-[3], software techniques
for power-aware scheduling, workload migration, and
resource consolidation, among others. Power prices exhibit
variations along time, space (geography), and even across
utility providers. As an example, consider Fig. 1, which shows the
average hourly spot market prices for the Los Angeles Zone
LA1 obtained from CAISO [6]. These correspond to the week
of 01/01/2005-01/07/2005 and denote the average price of 1
MW-Hour of electricity. Consequently, minimization of energy
consumption need not coincide with that of the electricity bill.

Given the diversity within power price and availability,
attention has recently turned towards demand response (DR)
within data centers. DR within a data center (or a set of
related data centers) attempts to optimize the electricity bill
by adapting its needs to the temporal, spatial, and cross-utility
diversity exhibited by power price. The key idea behind these
techniques is to preferentially shift power draw (i) to times and
places or (ii) from utilities offering cheaper prices. Typically
some constraints in the form of performance requirements for
the workload (e.g., response times offered to the clients of a
Web-based application) limit the cost reduction benefits that
can result from such DR. Whereas existing DR techniques
have relied on various forms of workload scheduling/shifting,
a complementary knob to facilitate such movement of power
needs is offered by energy storage devices, typically uninterrupted
power supply (UPS) units, residing in data centers.

Fig. 1. Average hourly spot market price during the week of 01/01/2005 -
01/07/2005 for LA1 Zone [6].

A data center deploys captive power sources, typically diesel 
generators (DG), that it uses for keeping itself powered up 
when the utility experiences an outage. The UPS units serve 
as a bridging mechanism to facilitate this transition from utility 
to DG: upon a utility failure, the data center is kept powered 
by the UPS unit using energy stored within its batteries, before 
the DG can start up and provide power. Whereas this transition 
takes only 10-20 seconds, UPS units have enough battery 
capacity to keep the entire data center powered at its maximum 
power needs for anywhere between 5-30 minutes. Tapping into 
the energy reserves of the UPS unit can allow a data center to 
improve its electricity bill. Intuitively, the data center would 
store energy within the UPS unit when prices are low and use 
this to augment the draw from the utility when prices are high. 

In this paper, we consider the problem of developing an 
online control policy to exploit the UPS unit along with the 
presence of delay-tolerance within the workload to optimize 
the data center's electricity bill. This is a challenging problem 
because data centers experience time-varying workloads and 
power prices with possibly unknown statistics. Even when
statistics can be approximated (say by learning using past
observations), traditional approaches to construct optimal control
policies involve the use of Markov Decision Theory
and Dynamic Programming [7]. It is well known that these
techniques suffer from the "curse of dimensionality" where
the complexity of computing the optimal strategy grows with
the system size. Furthermore, such solutions result in hard-to-implement
systems, where significant re-computation might be
needed when statistics change.

In this work, we make use of a different approach that can
overcome the challenges associated with dynamic programming.
This approach is based on the recently developed technique
of Lyapunov optimization [8], [9] that enables the design
of online control algorithms for such time-varying systems.
These algorithms operate without requiring any knowledge of
the system statistics and are easy to implement. We design such
an algorithm for optimally exploiting the UPS unit and delay-tolerance
of workloads to minimize the time average cost. We
show that our algorithm can get within O(1/V) of the optimal
solution where the maximum value of V is limited by battery
capacity. We note that, for the same parameters, a dynamic 
programming based approach (if it can be solved) will yield a 
better result than our algorithm. However, this gap reduces as 
the battery capacity is increased. Our algorithm is thus most 
useful when such scaling is practical. 

II. Related Work 

One recent body of work proposes online algorithms for
using UPS units for cost reduction via shaving workload
"peaks" that correspond to higher energy prices [10], [11].
This work is highly complementary to ours in that it offers a
worst-case competitive ratio analysis while our approach looks
at the average case performance. Whereas a variety of work
has looked at workload shifting for power cost reduction [2]
or other reasons such as performance and availability [5], our
work differs both due to its usage of energy storage as well as
the cost optimality guarantees offered by our technique. Some
research has considered consumers with access to multiple
utility providers, each with a different carbon profile, power
price and availability and looked at optimizing cost subject to
performance and/or carbon emissions constraints [12]. Another
line of work has looked at cost reduction opportunities offered
by geographical variations within utility prices for data centers
where portions of workloads could be serviced from one
of several locations [12], [13]. Finally, [14] considers the
use of rechargeable batteries for maximizing system utility
in a wireless network. While all of this research is highly
complementary to our work, there are three key differences:
(i) our investigation of energy storage as an enabler of cost
reduction, (ii) our use of the technique of Lyapunov optimization
which allows us to offer a provably cost optimal solution,
and (iii) combining energy storage with delay-tolerance within
workloads.

III. Basic Model 

We consider a time-slotted model. In the basic model, we
assume that in every slot, the total power demand generated by
the data center in that slot must be met in the current slot itself
(using a combination of power drawn from the utility and the
battery). Thus, any buffering of the workload generated by the
data center is not allowed. We will relax this constraint later
in Sec. VI when we allow buffering of some of the workload
while providing worst case delay guarantees. In the following,
we use the terms UPS and battery interchangeably.

Fig. 2. Block diagram for the basic model.

A. Workload Model 

Let W(t) be the total workload (in units of power) generated
in slot t. Let P(t) be the total power drawn from the grid in
slot t, out of which R(t) is used to recharge the battery. Also,
let D(t) be the total power discharged from the battery in slot
t. Then in the basic model, the following constraint must be
satisfied in every slot (Fig. 2):

$$W(t) = P(t) - R(t) + D(t) \tag{1}$$

Every slot, a control algorithm observes W(t) and makes
decisions about how much power to draw from the grid in
that slot, i.e., P(t), and how much to recharge and discharge
the battery, i.e., R(t) and D(t). Note that by (1), having chosen
P(t) and R(t) completely determines D(t).

Assumptions on the statistics of W(t): The workload process
W(t) is assumed to vary randomly, taking values from
a set W of non-negative values, and is not influenced by past
control decisions. The set W is assumed to be finite, with
potentially arbitrarily large size. The underlying probability
distribution or statistical characterization of W(t) is not necessarily
known. We only assume that its maximum value is
finite, i.e., W(t) ≤ Wmax for all t. Note that unlike existing
work in this domain, we do not make assumptions such as
Poisson arrivals or exponential service times.

For simplicity, in the basic model we assume that W(t)
evolves according to an i.i.d. process, noting that the algorithm
developed for this case can be applied without any
modifications to non-i.i.d. scenarios as well. The analysis and
performance guarantees for the non-i.i.d. case can be obtained
using the delayed Lyapunov drift and T slot drift techniques
developed in [8], [9].

B. Battery Model 

Ideally, we would like to incorporate the following idiosyncrasies
of battery operation into our model. First, batteries
become unreliable as they are charged/discharged, with higher
depth-of-discharge (DoD), the percentage of maximum charge
removed during a discharge cycle, causing faster degradation
in their reliability. This dependence between the useful lifetime
of a battery and how it is discharged/charged is expressed
via battery lifetime charts [15]. For example, with lead-acid
batteries that are commonly used in UPS units, 20% DoD
yields 1400 cycles [16]. Second, batteries have conversion loss
whereby a portion of the energy stored in them is lost when
discharging them (e.g., about 10-15% for lead-acid batteries).
Furthermore, certain regions of battery operation (high rate of
discharge) are more inefficient than others. Finally, the storage
itself may be "leaky", so that the stored energy decreases over
time, even in the absence of any discharging.

For simplicity, in the basic model we will assume that 
there is no power loss either in recharging or discharging the 
batteries, noting that this can be easily generalized to the case 
where a fraction of R(t), D(t) is lost. We will also assume
that the batteries are not leaky, so that the stored energy level 
decreases only when they are discharged. This is a reasonable 
assumption when the time scale over which the loss takes 
place is much larger than that of interest to us. To model the 
effect of repeated recharging and discharging on the battery's 
lifetime, we assume that with each recharge and discharge 
operation, a fixed cost (in dollars) of Crc and Cdc respectively 
is incurred. The choice of these parameters would affect the 
trade-off between the cost of the battery itself and the cost 
reduction benefits it offers. For example, suppose a new battery 
costs B dollars and it can sustain N discharge/charge cycles 
(ignoring DoD for now). Then setting Crc = Cdc = B/N 
would amount to expecting the battery to "pay for itself" by 
augmenting the utility N times over its lifetime.

In any slot, we assume that one can either recharge or 
discharge the battery or do neither, but not both. This means 
that for all t, we have: 

$$R(t) > 0 \implies D(t) = 0, \qquad D(t) > 0 \implies R(t) = 0 \tag{2}$$

Let Y(t) denote the battery energy level in slot t. Then, the
dynamics of Y(t) can be expressed as:

$$Y(t+1) = Y(t) - D(t) + R(t) \tag{3}$$

The battery is assumed to have a finite capacity Ymax so that
Y(t) ≤ Ymax for all t. Further, for the purpose of reliability,
it may be required to ensure that a minimum energy level
Ymin ≥ 0 is maintained at all times. For example, this could
represent the amount of energy required to support the data
center operations until a secondary power source (such as
DG) is activated in the event of a grid outage. Recall that the
UPS unit is integral to the availability of power supply to the
data center upon utility outage. Indiscriminate discharging of
the UPS can leave the data center in situations where it is unable
to safely fail over to DG upon a utility outage. Therefore,
discharging the UPS must be done carefully so that it still
possesses enough charge to reliably carry out its role as a
transition device between utility and DG. Thus, the following
condition must be met in every slot under any feasible control
algorithm:

$$Y_{\min} \le Y(t) \le Y_{\max} \tag{4}$$

The effectiveness of the online control algorithm we present
in Sec. V will depend on the magnitude of the difference
Ymax - Ymin. In most practical scenarios of interest, this
value is expected to be at least moderately large: recent work
suggests that storing energy Ymin to last about a minute is
sufficient to offer reliable data center operation [17], while
Ymax can vary between 5-20 minutes (or even higher) due to
reasons such as UPS units being available only in certain sizes 
and the need to keep room for future IT growth. Furthermore, 
the UPS units are sized based on the maximum provisioned 
capacity of the data center, which is itself often substantially 
(up to twice [18]) higher than the maximum actual power 
demand. 

The initial charge level in the battery is given by Yinit and
satisfies Ymin ≤ Yinit ≤ Ymax. Finally, we assume that the
maximum amounts by which we can recharge or discharge the
battery in any slot are bounded. Thus, we have for all t:

$$0 \le R(t) \le R_{\max}, \qquad 0 \le D(t) \le D_{\max} \tag{5}$$

We will assume that Ymax - Ymin > Rmax + Dmax while
noting that in practice, Ymax - Ymin is much larger than
Rmax + Dmax. Note that any feasible control decision on
R(t), D(t) must ensure that both of the constraints (4) and
(5) are satisfied. This is equivalent to the following:

$$0 \le R(t) \le \min[R_{\max},\; Y_{\max} - Y(t)] \tag{6}$$

$$0 \le D(t) \le \min[D_{\max},\; Y(t) - Y_{\min}] \tag{7}$$
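Constraints (6) and (7) amount to a per-slot clamp on how much the battery may be recharged or discharged. The following is a minimal Python sketch of that clamp together with the dynamics (3); the function and argument names are illustrative and do not come from the model description.

```python
def feasible_bounds(y, y_min, y_max, r_max, d_max):
    """Return (largest feasible recharge, largest feasible discharge).

    Mirrors constraints (6) and (7): the per-slot rate limits are capped
    by the remaining headroom (y_max - y) and the usable charge (y - y_min).
    """
    if not (y_min <= y <= y_max):
        raise ValueError("battery level outside [y_min, y_max]")
    r_cap = min(r_max, y_max - y)
    d_cap = min(d_max, y - y_min)
    return r_cap, d_cap


def step_battery(y, r, d):
    """Battery dynamics (3); by (2), recharge and discharge are exclusive."""
    if r > 0 and d > 0:
        raise ValueError("cannot recharge and discharge in the same slot")
    return y - d + r
```

For instance, a battery at level 95 with Ymax = 100 and Rmax = 10 can accept at most 5 units of recharge in the current slot.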

C. Cost Model 

The cost per unit of power drawn from the grid in slot t
is denoted by C(t). In general, it can depend on both P(t),
the total amount of power drawn in slot t, and an auxiliary
state variable S(t) that captures parameters such as time of
day, identity of the utility provider, etc. For example, the per
unit cost may be higher during business hours. Similarly,
for any fixed S(t), it may be the case that C(t) increases with
P(t) so that the per unit cost of electricity increases as more power
is drawn. This may be because the utility provider wants to
discourage heavier loading on the grid. Thus, we assume that
C(t) is a function of both S(t) and P(t) and we denote this
as:

$$C(t) = C(S(t), P(t)) \tag{8}$$

For notational convenience, we will use C(t) to denote the per 
unit cost in the rest of the paper noting that the dependence 
of C(t) on S(t) and P(t) is implicit. 

The auxiliary state process S(t) is assumed to evolve
independently of the decisions taken by any control policy.
For simplicity, we assume that every slot it takes values
from a finite but arbitrarily large set S in an i.i.d. fashion
according to a potentially unknown distribution. This can again
be generalized to non-i.i.d. Markov modulated scenarios using
the techniques developed in [8], [9]. For each S(t), the unit cost
is assumed to be a non-decreasing function of P(t). Note that
it is not necessarily convex or strictly monotonic or continuous.
This is quite general and can be used to model a variety of
scenarios. A special case is when C(t) is only a function of
S(t). The optimal control action for this case has a particularly
simple form and we will highlight this in Sec. V-A. The unit
cost is assumed to be non-negative and finite for all S(t), P(t).

We assume that the maximum amount of power that can be
drawn from the grid in any slot is upper bounded by Ppeak.
Thus, we have for all t:

$$0 \le P(t) \le P_{peak} \tag{9}$$

Note that if we consider the original scenario where batteries
are not used, then Ppeak must be such that all workload can
be satisfied. Therefore, Ppeak ≥ Wmax.

Finally, let Cmax and Cmin denote the maximum and minimum
per unit cost respectively over all S(t), P(t). Also let
χmin > 0 be a constant such that for any P1, P2 ∈ [0, Ppeak]
where P1 ≤ P2, the following holds for all χ ≥ χmin:

$$P_1\big(-\chi + C(S, P_1)\big) \ge P_2\big(-\chi + C(S, P_2)\big) \tag{10}$$

For example, when C(t) does not depend on P(t), then
χmin = Cmax satisfies (10). This follows by noting that
(-Cmax + C(t)) ≤ 0 for all t. Similarly, suppose C(t)
does not depend on S(t), but is continuous, convex, and
increasing in P(t). Then, it can be shown that χmin =
C(Ppeak) + Ppeak C'(Ppeak) satisfies (10), where C'(Ppeak)
denotes the derivative of C with respect to P evaluated at Ppeak. In the
following, we assume that such a finite χmin exists for the
given cost model. We further assume that χmin > Cmin. The
case of χmin = Cmin corresponds to the degenerate case
where the unit cost is fixed for all times and we do not consider
it.
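Condition (10) says that the product of P and (C(S, P) minus the constant) is non-increasing in P on [0, Ppeak]. Because P2 ≥ P1, a constant that satisfies (10) keeps satisfying it when increased, so it is enough to test the smallest candidate. The sketch below checks the condition numerically on a grid. The helper name, the grid resolution, and the sample cost functions are assumptions made for illustration only.

```python
def satisfies_condition_10(cost, chi, p_peak, states, steps=200):
    """Check on a uniform grid that P * (-chi + cost(P, s)) never increases in P.

    cost(p, s) is a hypothetical per-unit price function; states lists the
    auxiliary states s to test. Monotonicity between neighbouring grid
    points implies the pairwise condition for all grid points.
    """
    grid = [p_peak * i / steps for i in range(steps + 1)]
    for s in states:
        values = [p * (-chi + cost(p, s)) for p in grid]
        for earlier, later in zip(values, values[1:]):
            if later > earlier + 1e-9:
                return False
    return True


# Price independent of P: the constant must be at least the largest price.
flat = lambda p, s: s
# Convex, increasing price C(P) = 0.1 * P with P_peak = 20:
# C(P_peak) + P_peak * C'(P_peak) = 2 + 20 * 0.1 = 4.
linear = lambda p, s: 0.1 * p
```

With prices in {2, 6, 10}, the candidate 10 passes for the flat cost while 5 fails; for the linear cost, 4 passes and 3 fails, in line with the two examples in the text.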

What is known in each slot?: We assume that the value of
S(t) and the form of the function C(S(t), P(t)) for that slot
are known. For example, this may be obtained beforehand using
pre-advertised prices by the utility provider. We assume that
given an S(t) = s, C(t) is a deterministic function of P(t)
and this holds for all s. Similarly, the amount of incoming
workload W(t) is known at the beginning of each slot.

Given this model, our goal is to design a control algorithm 
that minimizes the time average cost while meeting all the 
constraints. This is formalized in the next section. 

IV. Control Objective 

Let P(t), R(t) and D(t) denote the control decisions made
in slot t by any feasible policy under the basic model as
discussed in Sec. III. These must satisfy the constraints (1), (2),
(6), (7), and (9) every slot. We define the following indicator
variables that are functions of the control decisions regarding
a recharge or discharge operation in slot t:

$$1_R(t) = \begin{cases} 1 & \text{if } R(t) > 0 \\ 0 & \text{else} \end{cases} \qquad 1_D(t) = \begin{cases} 1 & \text{if } D(t) > 0 \\ 0 & \text{else} \end{cases}$$

Note that by (2), at most one of 1_R(t) and 1_D(t) can take
the value 1. Then the total cost incurred in slot t is given by
P(t)C(t) + 1_R(t)Crc + 1_D(t)Cdc. The time-average cost under
this policy is given by:

$$\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\} \tag{11}$$

where the expectation above is with respect to the potential 
randomness of the control policy. Assuming for the time being 
that this limit exists, our goal is to design a control algorithm 
that minimizes this time average cost subject to the constraints
described in the basic model. Mathematically, this can be
stated as the following stochastic optimization problem:

P1:

$$\text{Minimize: } \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\}$$

Subject to: Constraints (1), (2), (6), (7), (9)

The finite capacity and underflow constraints (6), (7) make
this a particularly challenging problem to solve even if the
statistical descriptions of the workload and unit cost process
are known. For example, the traditional approach based on
Dynamic Programming [7] would have to compute the optimal
control action for all possible combinations of the battery
charge level and the system state (S(t), W(t)). Instead, we
take an alternate approach based on the technique of Lyapunov
optimization, taking the finite size queues constraint explicitly
into account.

Note that a solution to the problem P1 is a control policy
that determines the sequence of feasible control decisions
P(t), R(t), D(t) to be used. Let φopt denote the value of
the objective in problem P1 under an optimal control policy.
Define the time-average rate of recharge and discharge under
any policy as follows:

$$\bar{R} = \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{R(\tau)\}, \qquad \bar{D} = \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{D(\tau)\} \tag{12}$$

Now consider the following problem:

P2:

$$\text{Minimize: } \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\}$$

Subject to: Constraints (1), (2), (5), (9)

$$\bar{R} = \bar{D} \tag{13}$$

Let φ denote the value of the objective in problem P2 under
an optimal control policy. By comparing P1 and P2, it can be
shown that P2 is less constrained than P1. Specifically, any
feasible solution to P1 would also satisfy P2. To see this, consider
any policy that satisfies (6) and (7) for all t. This ensures
that constraints (4) and (5) are always met by this policy. Then
summing equation (3) over all τ ∈ {0, 1, 2, ..., t - 1} under
this policy and taking expectation of both sides yields:

$$E\{Y(t)\} - Y_{init} = \sum_{\tau=0}^{t-1} E\{R(\tau) - D(\tau)\}$$

Since Ymin ≤ Y(t) ≤ Ymax for all t, dividing both sides by t
and taking limits as t → ∞ yields R̄ = D̄. Thus, this policy
satisfies constraint (13) of P2. Therefore, any feasible solution
to P1 also satisfies P2. This implies that the optimal value of
P2 cannot exceed that of P1, so that φ ≤ φopt.
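The limiting step in the argument above can be written out explicitly. The following is a short worked derivation using only the battery dynamics (3) and the bounds (4); it assumes, as the text does, that the limits defining the time-average recharge and discharge rates exist.

```latex
% Bounding the summed battery dynamics with Y_min <= Y(t) <= Y_max:
\frac{Y_{\min} - Y_{init}}{t}
  \;\le\; \frac{E\{Y(t)\} - Y_{init}}{t}
  \;=\; \frac{1}{t}\sum_{\tau=0}^{t-1} E\{R(\tau) - D(\tau)\}
  \;\le\; \frac{Y_{\max} - Y_{init}}{t}
% Both outer terms vanish as t -> infinity because Y_min, Y_max and
% Y_init are finite constants, so the middle term converges to zero:
\lim_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} E\{R(\tau)\}
  \;-\; \lim_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} E\{D(\tau)\} \;=\; 0
```

In words, a bounded battery cannot absorb or release energy at a nonzero long-run rate, which is why the time-average recharge and discharge rates must match.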

Our approach to solving P1 will be based on this observation.
We first note that it is easier to characterize the optimal
solution to P2. This is because the dependence on Y(t) has
been removed. Specifically, it can be shown that the optimal
solution to P2 can be achieved by a stationary, randomized
control policy that chooses control actions P(t), D(t), R(t)
every slot purely as a function (possibly randomized) of the
current state (W(t), S(t)) and independent of the battery
charge level Y(t). This fact is presented in the following
lemma.

Lemma 1: (Optimal Stationary, Randomized Policy): If the
workload process W(t) and auxiliary process S(t) are i.i.d.
over slots, then there exists a stationary, randomized policy
that takes control decisions P^stat(t), R^stat(t), D^stat(t) every
slot purely as a function (possibly randomized) of the current
state (W(t), S(t)) while satisfying the constraints (1), (2), (5),
(9) and providing the following guarantees:

$$E\{R^{stat}(t)\} = E\{D^{stat}(t)\} \tag{14}$$

where the expectations above are with respect to the stationary
distribution of (W(t), S(t)) and the randomized control
decisions.

Proof: This result follows from the framework in [8], [9]
and is omitted for brevity. ■
It should be noted that while it is possible to characterize
and potentially compute such a policy, it may not be feasible
for the original problem P1 as it could violate the constraints
(6) and (7). However, the existence of such a policy can be
used to construct an approximately optimal policy that meets
all the constraints of P1 using the technique of Lyapunov
optimization [8], [9]. This policy is dynamic and does not
require knowledge of the statistical description of the workload
and cost processes. We present this policy and derive its
performance guarantees in the next section. This dynamic
policy is approximately optimal where the approximation
factor improves as the battery capacity increases. Also note
that the distance from optimality for our policy is measured
in terms of φ. However, since φ ≤ φopt, in practice the
approximation factor is better than the analytical bounds.



V. Optimal Control Algorithm 

We now present an online control algorithm that approximately
solves P1. This algorithm uses a control parameter
V > 0 that affects the distance from optimality as shown
later. This algorithm also makes use of a "queueing" state
variable X(t) to track the battery charge level, defined
as follows:

$$X(t) = Y(t) - V\chi_{\min} - D_{\max} - Y_{\min} \tag{16}$$

Recall that Y(t) denotes the actual battery charge level in slot
t and evolves according to (3). It can be seen that X(t) is
simply a shifted version of Y(t) and its dynamics are given by:

$$X(t+1) = X(t) - D(t) + R(t) \tag{17}$$

Note that X(t) can be negative. We will show that this
definition enables our algorithm to ensure that the constraint
(4) is met.

We are now ready to state the dynamic control algorithm.
Let (W(t), S(t)) and X(t) denote the system state in slot t.
Then the dynamic algorithm chooses control action P(t) as
the solution to the following optimization problem:

P3:

$$\text{Minimize: } X(t)P(t) + V\big[P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}\big]$$

Subject to: Constraints (1), (2), (5), (9)

The constraints above result in the following constraint on
P(t):

$$P_{low} \le P(t) \le P_{high} \tag{18}$$

where Plow = max[0, W(t) - Dmax] and Phigh =
min[Ppeak, W(t) + Rmax]. Let P*(t) denote the optimal
solution to P3. Then, the dynamic algorithm chooses the
recharge and discharge values as follows.

$$R^*(t) = \begin{cases} P^*(t) - W(t) & \text{if } P^*(t) > W(t) \\ 0 & \text{else} \end{cases} \qquad D^*(t) = \begin{cases} W(t) - P^*(t) & \text{if } P^*(t) < W(t) \\ 0 & \text{else} \end{cases}$$

Note that if P*(t) = W(t), then both R*(t) = 0 and D*(t) = 0
and all demand is met using power drawn from the grid. It
can be seen from the above that the control decisions satisfy
the constraints 0 ≤ R*(t) ≤ Rmax and 0 ≤ D*(t) ≤ Dmax.
That the finite battery constraints and the constraints (6), (7)
are also met will be shown in Sec. V-C.

Fig. 3. Periodic W(t) process in the example.

After computing these quantities, the algorithm implements
them and updates the queueing variable X(t) according to
(17). This process is repeated every slot. Note that in solving
P3, the control algorithm only makes use of the current system
state values and does not require knowledge of the statistics
of the workload or unit cost processes. Thus, it is myopic and
greedy in nature. From P3, it is seen that the algorithm tries
to recharge the battery when X(t) is negative and per unit
cost is low. And it tries to discharge the battery when X(t)
is positive. That this is sufficient to achieve optimality will
be shown in Theorem 1. The queueing variable X(t) plays a
crucial role as making decisions purely based on prices is not
necessarily optimal.
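To make the per-slot decision concrete, the sketch below solves P3 approximately for an arbitrary unit cost function by searching a uniform grid over the interval from Plow to Phigh. It is an illustration under stated assumptions, not the closed-form solution discussed later: the function name, the `unit_cost` callback, and the grid resolution are hypothetical, and the code presumes W(t) ≤ Ppeak. It does not itself enforce (6) and (7).

```python
def solve_p3(x, w, v, unit_cost, c_rc, c_dc, r_max, d_max, p_peak, steps=1000):
    """Approximately minimise x*P + v*(P*C(P) + 1_R*c_rc + 1_D*c_dc) over P.

    x is the shifted battery level X(t), w the workload W(t), and
    unit_cost(p) the per-unit price for this slot as a function of P(t).
    Returns (P*, R*, D*).
    """
    p_low = max(0.0, w - d_max)
    p_high = min(p_peak, w + r_max)
    candidates = [p_low + (p_high - p_low) * i / steps for i in range(steps + 1)]
    candidates.append(w)  # always consider "no battery use"

    def objective(p):
        # A recharge fee applies when drawing more than the workload,
        # a discharge fee when drawing less, and none when P equals W.
        fee = c_rc if p > w else c_dc if p < w else 0.0
        return x * p + v * (p * unit_cost(p) + fee)

    p_star = min(candidates, key=objective)
    r_star = max(p_star - w, 0.0)
    d_star = max(w - p_star, 0.0)
    return p_star, r_star, d_star
```

With a strongly negative X(t) and a low price the search lands on Phigh (full recharge); with a positive X(t) and a high price it lands on Plow (full discharge), matching the intuition described above.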

To get some intuition behind the working of this algorithm,
consider the following simple example. Suppose W(t) can
take three possible values from the set {Wlow, Wmid, Whigh}
where Wlow < Wmid < Whigh. Similarly, C(t) can take
three possible values in {Clow, Cmid, Chigh} where Clow <
Cmid < Chigh and does not depend on P(t). We assume that
the workload process evolves in a frame-based periodic fashion.
Specifically, in every odd numbered frame, W(t) = Wmid
for all except the last slot of the frame when W(t) = Wlow.
In every even numbered frame, W(t) = Wmid for all except
the last slot of the frame when W(t) = Whigh. This is
illustrated in Fig. 3. The C(t) process evolves similarly, such
that C(t) = Clow when W(t) = Wlow, C(t) = Cmid when
W(t) = Wmid, and C(t) = Chigh when W(t) = Whigh.

In the following, we assume a frame size of 5 slots with
Wlow = 10, Wmid = 15, and Whigh = 20 units. Also,
Clow = 2, Cmid = 6, and Chigh = 10 dollars. Finally,
Rmax = Dmax = 10, Ppeak = 20, Crc = Cdc = 5,
Yinit = Ymin = 0 and we vary Ymax > Rmax + Dmax. In
this example, intuitively, an optimal algorithm that knows the
workload and unit cost process beforehand would recharge the
battery as much as possible when C(t) = Clow and discharge
it as much as possible when C(t) = Chigh. In fact, it can
be shown that the following strategy is feasible and achieves
minimum average cost:

• If C(t) = Clow, W(t) = Wlow, then P(t) = Wlow + Rmax, R(t) = Rmax, D(t) = 0.

• If C(t) = Cmid, W(t) = Wmid, then P(t) = Wmid, R(t) = 0, D(t) = 0.

• If C(t) = Chigh, W(t) = Whigh, then P(t) = Whigh - Dmax, R(t) = 0, D(t) = Dmax.

The time average cost resulting from this strategy can be easily
calculated and is given by 87.0 dollars/slot for all Ymax >
10. Also, we note that the cost resulting from an algorithm
that does not use the battery in this example is given by 94.0
dollars/slot.
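The two averages quoted above can be reproduced with a few lines of arithmetic over one odd frame followed by one even frame (10 slots). The Python sketch below uses the example's parameters; the function names are illustrative.

```python
W_LOW, W_MID, W_HIGH = 10, 15, 20
C_LOW, C_MID, C_HIGH = 2, 6, 10
R_MAX = D_MAX = 10
C_RC = C_DC = 5
FRAME = 5


def two_frame_profile():
    """(workload, price) pairs for one odd frame followed by one even frame."""
    odd = [(W_MID, C_MID)] * (FRAME - 1) + [(W_LOW, C_LOW)]
    even = [(W_MID, C_MID)] * (FRAME - 1) + [(W_HIGH, C_HIGH)]
    return odd + even


def average_cost(use_battery):
    """Average cost per slot with or without the recharge/discharge strategy."""
    total = 0
    for w, c in two_frame_profile():
        if use_battery and c == C_LOW:
            total += (w + R_MAX) * c + C_RC   # recharge at the low price
        elif use_battery and c == C_HIGH:
            total += (w - D_MAX) * c + C_DC   # discharge at the high price
        else:
            total += w * c                    # serve the workload from the grid
    return total / (2 * FRAME)
```

Here `average_cost(True)` evaluates to 87.0 and `average_cost(False)` to 94.0: the battery saves 100 dollars on the high-price slot and costs 20 dollars of extra energy plus 10 dollars of recharge and discharge fees over every 10 slots.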

Now we simulate the dynamic algorithm for this example
for different values of Ymax for 1000 slots (200 frames).
The value of V is chosen to be
(Ymax - Ymin - Rmax - Dmax)/(Chigh - Clow) = (Ymax - 20)/8
(this choice will become clear in Sec. V-B when we
relate V to the battery capacity). Note that the number of slots
for which a fully charged battery can sustain the data center
at maximum load is Ymax/Whigh.

In Table I we show the time average cost achieved for
different values of Ymax. It can be seen that as Ymax increases,
the time average cost approaches the optimal value.
(This behavior will be formalized in Theorem 1.) This is
remarkable given that the dynamic algorithm operates without
any knowledge of the future workload and cost processes.
To examine the behavior of the dynamic algorithm in more
detail, we fix Ymax = 100 and look at the sample paths of the
control decisions taken by the optimal offline algorithm and
the dynamic algorithm during the first 200 slots. This is shown
in Figs. 4 and 5. It can be seen that initially, the dynamic algorithm tends
to perform suboptimally. But eventually it learns to make close
to optimal decisions.

It might be tempting to conclude from this example that an
algorithm based on a price threshold is optimal. Specifically,
such an algorithm makes a recharge vs. discharge decision
depending on whether the current price C(t) is smaller or
larger than a threshold. However, it is easy to construct
examples where the dynamic algorithm outperforms such a
threshold based algorithm. Specifically, suppose that the W(t)
process takes values from the interval [10, 90] uniformly at
random every slot. Also, suppose C(t) takes values from
the set {2, 6, 10} dollars uniformly at random every slot. We
fix the other parameters as follows: Rmax = Dmax = 10,
Ppeak = 90, Crc = Cdc = 1, Yinit = Ymin = 0, and
Ymax = 100. We then simulate a threshold based algorithm
for different values of the threshold in the set {2, 6, 10} and
select the one that yields the smallest cost. This was found to
be 280.7 dollars/slot. We then simulate the dynamic algorithm
for 10000 slots with V = 10.0 and it yields an
average cost of 275.5 dollars/slot. We also note that the cost
resulting from an algorithm that does not use the battery in
this example is given by 300.73 dollars/slot.

Fig. 4. P(t) under the offline optimal solution with Ymax = 100.

Fig. 5. P(t) under the Dynamic Algorithm with Ymax = 100.

We now establish two properties of the structure of the 
optimal solution to P3 that will be useful in analyzing its 
performance later. 

Lemma 2: The optimal solution to P3 has the following
properties:

1) If X(t) > -V Cmin, then the optimal solution always
chooses R*(t) = 0.

2) If X(t) < -V χmin, then the optimal solution always
chooses D*(t) = 0.

Proof: See Appendix A. ■
A. Solving P3 

In general, the complexity of solving P3 depends on the
structure of the unit cost function \(C(t)\). For many cases of
practical interest, P3 is easy to solve and admits closed form
solutions that can be implemented in real time. We consider
two such cases here. Let \(\theta(t)\) denote the value of the objective
in P3 when there is no recharge or discharge. Thus \(\theta(t) = W(t)(X(t) + VC(t))\).

1) \(C(t)\) does not depend on \(P(t)\): Suppose that \(C(t)\)
depends only on \(S(t)\) and not on \(P(t)\). We can rewrite the
expression in the objective of P3 as
\(P(t)(X(t) + VC(t)) + 1_R(t)VC_{rc} + 1_D(t)VC_{dc}\).
Then, the optimal solution has the following simple threshold structure.

1) If \(X(t) + VC(t) > 0\), then \(R^*(t) = 0\) so that there is no
recharge and we have the following two cases:

a) If \(P_{low}(X(t) + VC(t)) + VC_{dc} < \theta(t)\), then discharge
as much as possible, so that we get \(D^*(t) = \min[W(t), D_{max}]\),
\(P^*(t) = \max[0, W(t) - D_{max}]\).

b) Else, draw all power from the grid. This yields
\(D^*(t) = 0\) and \(P^*(t) = W(t)\).

2) Else if \(X(t) + VC(t) \le 0\), then \(D^*(t) = 0\) so that there
is no discharge and we have the following two cases:

a) If \(P_{high}(X(t) + VC(t)) + VC_{rc} < \theta(t)\), then recharge
as much as possible. This yields \(R^*(t) = \min[P_{peak} - W(t), R_{max}]\)
and \(P^*(t) = \min[P_{peak}, W(t) + R_{max}]\).

b) Else, draw all power from the grid. This yields \(R^*(t) = 0\)
and \(P^*(t) = W(t)\).

We will show that this solution is feasible and does not
violate the finite battery constraint in Sec. IV-C.
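The threshold structure above can be written in a few lines. The following Python sketch is only an illustration: the function name `solve_p3` and the `Decision` record are hypothetical, and \(P_{low} = \max[0, W(t) - D_{max}]\) and \(P_{high} = \min[P_{peak}, W(t) + R_{max}]\) are assumed to be the grid draws under maximum discharge and maximum recharge, which is what the expressions for \(P^*(t)\) in the two cases suggest.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    P: float  # power drawn from the grid in this slot
    R: float  # amount recharged into the battery
    D: float  # amount discharged from the battery


def solve_p3(X, W, C, V, R_max, D_max, P_peak, C_rc, C_dc):
    """Minimize P*(X + V*C) + V*(1_R*C_rc + 1_D*C_dc) when C does not depend on P.

    Names and signature are illustrative assumptions, not taken from the paper.
    """
    slope = X + V * C          # coefficient multiplying P(t) in the objective
    theta = W * slope          # objective value with no recharge and no discharge
    if slope > 0:
        # Recharging would only raise the objective, so only discharge is a candidate.
        P_low = max(0.0, W - D_max)
        if P_low * slope + V * C_dc < theta:
            return Decision(P=P_low, R=0.0, D=min(W, D_max))
    else:
        # Discharging would only raise the objective, so only recharge is a candidate.
        P_high = min(P_peak, W + R_max)
        if P_high * slope + V * C_rc < theta:
            return Decision(P=P_high, R=min(P_peak - W, R_max), D=0.0)
    # Otherwise serve the whole workload from the grid.
    return Decision(P=W, R=0.0, D=0.0)
```

In each returned decision the balance \(W(t) = P(t) - R(t) + D(t)\) holds, so the workload is always served in the slot in which it arrives.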

2) \(C(t)\) convex, increasing in \(P(t)\): Next suppose for each
\(S(t)\), \(C(t)\) is convex and increasing in \(P(t)\). For example,
\(C(S(t), P(t))\) may have the form \(a(S(t))P^2(t)\) where
\(a(S(t)) > 0\) for all \(S(t)\). In this case, P3 becomes a standard
convex optimization problem in a single variable \(P(t)\) and can
be solved efficiently. The full solution is provided in Appendix 



B. Performance Theorem 

We first define an upper bound \(V_{max}\) on the maximum value
that \(V\) can take in our algorithm.

\[
V_{max} \triangleq \frac{Y_{max} - Y_{min} - R_{max} - D_{max}}{\chi_{min} - C_{min}} \tag{19}
\]

Then we have the following result.
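As a quick numerical check of this bound, the sketch below evaluates \(V_{max} = (Y_{max} - Y_{min} - R_{max} - D_{max})/(\chi_{min} - C_{min})\) for the threshold example of the previous section. The function names are hypothetical, and the example assumes \(\chi_{min} = C_{max} = 10\), which is how \(\chi_{min}\) is set in the simulation section when the unit cost does not depend on \(P(t)\).

```python
def v_max(Y_max, Y_min, R_max, D_max, chi_min, C_min):
    """Largest admissible control parameter V for the basic model."""
    return (Y_max - Y_min - R_max - D_max) / (chi_min - C_min)


def x_bounds(V, Y_max, Y_min, D_max, chi_min):
    """Deterministic lower and upper bounds on the shifted battery level X(t)."""
    lower = -V * chi_min - D_max
    upper = Y_max - Y_min - D_max - V * chi_min
    return lower, upper


# Threshold example: Y_max = 100, Y_min = 0, R_max = D_max = 10, prices in {2, 6, 10}.
V = v_max(Y_max=100.0, Y_min=0.0, R_max=10.0, D_max=10.0, chi_min=10.0, C_min=2.0)
bounds = x_bounds(V, Y_max=100.0, Y_min=0.0, D_max=10.0, chi_min=10.0)
```

With these inputs the bound evaluates to 10.0, the value of \(V\) used in that example, and the width of the interval for \(X(t)\) equals \(Y_{max} - Y_{min}\).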

Theorem 1: (Algorithm Performance) Suppose the initial
battery charge level \(Y_{init}\) satisfies \(Y_{min} \le Y_{init} \le Y_{max}\). Then
implementing the algorithm above with any fixed parameter \(V\)
such that \(0 < V \le V_{max}\) for all \(t \in \{0, 1, 2, \ldots\}\) results in
the following performance guarantees:

1) The queue \(X(t)\) is deterministically upper and lower
bounded for all \(t\) as follows:

\[
-V\chi_{min} - D_{max} \le X(t) \le Y_{max} - Y_{min} - D_{max} - V\chi_{min} \tag{20}
\]

2) The actual battery level \(Y(t)\) satisfies \(Y_{min} \le Y(t) \le Y_{max}\) for all \(t\).

3) All control decisions are feasible.

4) If \(W(t)\) and \(S(t)\) are i.i.d. over slots, then the time-
average cost under the dynamic algorithm is within \(B/V\)
of the optimal value:

\[
\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\} \le \phi_{opt} + B/V \tag{21}
\]

where \(B\) is a constant given by \(B = \max[R_{max}^2, D_{max}^2]/2\)
and \(\phi_{opt}\) is the optimal solution to P1 under any feasible
control algorithm (possibly with knowledge of future
events).

Theorem 1 part 4 shows that by choosing larger \(V\), the time-
average cost under the dynamic algorithm can be pushed closer
to the minimum possible value \(\phi_{opt}\). However, \(V_{max}\) limits
how large \(V\) can be chosen.

C. Proof of Theorem 1

Here we prove Theorem 1.

Proof: (Theorem 1 part 1) We first show that (20) holds
for \(t = 0\). We have that

\[
Y_{min} \le Y(0) \le Y_{max} \tag{22}
\]

Using the definition of \(X(t)\), we have that
\(Y(0) = X(0) + V\chi_{min} + D_{max} + Y_{min}\). Using this in (22), we get:

\[
Y_{min} \le X(0) + V\chi_{min} + D_{max} + Y_{min} \le Y_{max}
\]

This yields

\[
-V\chi_{min} - D_{max} \le X(0) \le Y_{max} - Y_{min} - D_{max} - V\chi_{min}
\]

Now suppose (20) holds for slot \(t\). We will show that it
also holds for slot \(t + 1\). First, suppose
\(-VC_{min} < X(t) \le Y_{max} - Y_{min} - D_{max} - V\chi_{min}\). Then, from Lemma 2, we have
that \(R^*(t) = 0\). Thus, using the dynamics of \(X(t)\), we have that
\(X(t+1) \le X(t) \le Y_{max} - Y_{min} - D_{max} - V\chi_{min}\). Next, suppose
\(X(t) \le -VC_{min}\). Then, the maximum possible increase is \(R_{max}\) so
that \(X(t+1) \le -VC_{min} + R_{max}\). Now for all \(V\) such that
\(0 < V \le V_{max}\), we have that
\(-VC_{min} + R_{max} \le Y_{max} - Y_{min} - D_{max} - V\chi_{min}\).
This follows from the definition (19)
and the fact that \(\chi_{min} \ge C_{min}\). Thus, we have
\(X(t+1) \le Y_{max} - Y_{min} - D_{max} - V\chi_{min}\).

Next, suppose \(-V\chi_{min} - D_{max} \le X(t) < -V\chi_{min}\).
Then, from Lemma 2, we have that \(D^*(t) = 0\). Thus, using
the dynamics of \(X(t)\), we have that \(X(t+1) \ge X(t) \ge -V\chi_{min} - D_{max}\).
Next, suppose \(-V\chi_{min} \le X(t)\). Then the maximum possible
decrease is \(D_{max}\) so that \(X(t+1) \ge -V\chi_{min} - D_{max}\) for this
case as well. This shows that \(X(t+1) \ge -V\chi_{min} - D_{max}\).
Combining these two bounds proves (20). ■
Proof: (Theorem 1 parts 2 and 3) Part 2 directly follows
from (20) and the definition of \(X(t)\). Using
\(Y(t) = X(t) + V\chi_{min} + D_{max} + Y_{min}\) in the lower bound in (20), we have:
\(-V\chi_{min} - D_{max} \le Y(t) - V\chi_{min} - D_{max} - Y_{min}\), i.e., \(Y_{min} \le Y(t)\). Similarly,
using \(Y(t) = X(t) + V\chi_{min} + D_{max} + Y_{min}\) in the upper
bound in (20), we have:
\(Y(t) - V\chi_{min} - D_{max} - Y_{min} \le Y_{max} - Y_{min} - D_{max} - V\chi_{min}\), i.e., \(Y(t) \le Y_{max}\).

Part 3 now follows from part 2 and the constraint on \(P(t)\)
in P3. ■
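The deterministic battery bounds of parts 1-3 can be checked empirically. The following Python sketch simulates the threshold rule for a price that does not depend on \(P(t)\), using the parameters of the threshold example (\(W(t)\) uniform on [10, 90], \(C(t)\) uniform on {2, 6, 10}, \(R_{max} = D_{max} = 10\), \(P_{peak} = 90\), \(C_{rc} = C_{dc} = 1\), \(Y_{max} = 100\), \(V = 10\)). The function name, the random seed, and the choices \(\chi_{min} = C_{max} = 10\) and \(Y_{init} = Y_{min} = 0\) are assumptions made for illustration, so the averages it produces need not match the figures reported above.

```python
import random


def simulate_basic(num_slots=10000, seed=0):
    """Run the threshold rule and track battery level and average cost (illustrative)."""
    rng = random.Random(seed)
    V, R_max, D_max, P_peak = 10.0, 10.0, 10.0, 90.0
    C_rc = C_dc = 1.0
    Y_min, Y_max, chi_min = 0.0, 100.0, 10.0   # chi_min = C_max is an assumption
    Y = Y_min                                  # assumed initial charge level
    X = Y - V * chi_min - D_max - Y_min        # shifted battery level
    lowest, highest = Y, Y
    cost = baseline = 0.0
    for _ in range(num_slots):
        W = rng.uniform(10.0, 90.0)
        C = rng.choice([2.0, 6.0, 10.0])
        slope = X + V * C
        R = D = 0.0
        if slope > 0:
            d = min(W, D_max)
            if -d * slope + V * C_dc < 0:      # same test as P_low*slope + V*C_dc < theta
                D = d
        else:
            r = min(P_peak - W, R_max)
            if r * slope + V * C_rc < 0:       # same test as P_high*slope + V*C_rc < theta
                R = r
        P = W + R - D                          # grid draw that serves W exactly
        cost += P * C + (C_rc if R > 0 else 0.0) + (C_dc if D > 0 else 0.0)
        baseline += W * C                      # cost if the battery were never used
        X += R - D
        Y += R - D
        lowest, highest = min(lowest, Y), max(highest, Y)
    return lowest, highest, cost / num_slots, baseline / num_slots


lowest, highest, avg_cost, avg_baseline = simulate_basic()
```

The returned extremes of \(Y(t)\) stay inside \([Y_{min}, Y_{max}]\) even though the rule never inspects the battery level directly; the shift built into \(X(t)\) is what enforces the constraint.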

Proof: (Theorem 1 part 4) We make use of the technique
of Lyapunov optimization to show (21). We start by defining
a Lyapunov function as a scalar measure of congestion in
the system. Specifically, we define the following Lyapunov
function: \(L(X(t)) \triangleq \frac{1}{2}X^2(t)\). Define the conditional 1-slot
Lyapunov drift as follows:

\[
\Delta(X(t)) \triangleq \mathbb{E}\{L(X(t+1)) - L(X(t)) \mid X(t)\} \tag{23}
\]

Using the dynamics of \(X(t)\), \(\Delta(X(t))\) can be bounded as follows (see Appendix
B for details):

\[
\Delta(X(t)) \le B - X(t)\mathbb{E}\{D(t) - R(t) \mid X(t)\} \tag{24}
\]

where \(B = \max[R_{max}^2, D_{max}^2]/2\). Following the Lyapunov
optimization framework, we add to both sides of (24) the penalty
term \(V\mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid X(t)\}\) to get the
following:

\[
\begin{aligned}
\Delta(X(t)) &+ V\mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid X(t)\} \\
&\le B - X(t)\mathbb{E}\{D(t) - R(t) \mid X(t)\} + V\mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid X(t)\}
\end{aligned} \tag{25}
\]

Using the relation \(W(t) = P(t) - R(t) + D(t)\), we can rewrite
the above as:

\[
\begin{aligned}
\Delta(X(t)) &+ V\mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid X(t)\} \\
&\le B - X(t)\mathbb{E}\{W(t) \mid X(t)\} + X(t)\mathbb{E}\{P(t) \mid X(t)\} + V\mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid X(t)\}
\end{aligned} \tag{26}
\]

Comparing this with P3, it can be seen that given any queue
value \(X(t)\), our control algorithm is designed to minimize
the right hand side of (26) over all possible feasible control
policies. This includes the optimal, stationary, randomized
policy given in Lemma 1. Then, plugging the control decisions
corresponding to the stationary, randomized policy, it can be
shown that:

\[
\begin{aligned}
\Delta(X(t)) &+ V\mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid X(t)\} \\
&\le B + V\mathbb{E}\{P^{stat}(t)C^{stat}(t) + 1_R^{stat}(t)C_{rc} + 1_D^{stat}(t)C_{dc} \mid X(t)\} \\
&= B + V\hat{\phi} \le B + V\phi_{opt}
\end{aligned}
\]

Taking the expectation of both sides and using the law of
iterated expectations and summing over \(t \in \{0, 1, 2, \ldots, T-1\}\), we get:

\[
\sum_{t=0}^{T-1} V\mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}\} \le BT + VT\phi_{opt} - \mathbb{E}\{L(X(T))\} + \mathbb{E}\{L(X(0))\}
\]

Dividing both sides by \(VT\) and taking limit as \(T \to \infty\) yields:

\[
\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}\} \le \phi_{opt} + B/V
\]

where we have used the fact that \(\mathbb{E}\{L(X(0))\}\) is finite and
that \(\mathbb{E}\{L(X(T))\}\) is non-negative. ■

VI. Extensions to Basic Model

In this section, we extend the basic model of Sec. III to the
case where portions of the workload are delay-tolerant in the
sense they can be postponed by a certain amount without affecting
the utility the data center derives from executing them.
We refer to such postponement as buffering the workload.
Specifically, we assume that the total workload consists of both
delay tolerant and delay intolerant components. Similar to the
workload in the basic model, the delay intolerant workload
cannot be buffered and must be served immediately. However,
the delay tolerant component may be buffered and served later.
As an example, data centers run virus scanning programs
on most of their servers routinely (say once per day). As
long as a virus scan is executed once a day, its purpose is
served; it does not matter what time of the day is chosen for
this. The ability to delay some of the workload gives more
opportunities to reduce the average power cost in addition
to using the battery. We assume that our data center has
system mechanisms to implement such buffering of specified
workloads.

Fig. 6. Block diagram for the extended model with delay tolerant and delay
intolerant workloads.

In the following, we will denote the total workload generated
in slot \(t\) by \(W(t)\). This consists of the delay tolerant
and intolerant components denoted by \(W_1(t)\) and \(W_2(t)\)
respectively, so that \(W(t) = W_1(t) + W_2(t)\) for all \(t\). Similar
to the basic model, we use \(P(t), R(t), D(t)\) to denote the total
power drawn from the grid, the total power used to recharge
the battery and the total power discharged from the battery in
slot \(t\), respectively. Thus, the total amount available to serve
the workload is given by \(P(t) - R(t) + D(t)\). Let \(\gamma(t)\) denote
the fraction of this that is used to serve the delay tolerant
workload in slot \(t\). Then the amount used to serve the delay
intolerant workload is \((1 - \gamma(t))(P(t) - R(t) + D(t))\). Note
that the following constraint must be satisfied every slot:

\[
0 \le \gamma(t) \le 1 \tag{27}
\]

We next define \(U(t)\) as the unfinished work for the delay
tolerant workload in slot \(t\). The dynamics for \(U(t)\) can be
expressed as:

\[
U(t+1) = \max[U(t) - \gamma(t)(P(t) - R(t) + D(t)), 0] + W_1(t) \tag{28}
\]



For the delay intolerant workload, there are no such queues
since all incoming workload must be served in the same slot.
This means:

\[
W_2(t) = (1 - \gamma(t))(P(t) - R(t) + D(t)) \tag{29}
\]
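The split of the available power between the two workload classes can be made concrete as follows. This Python sketch assumes the backlog update adds the new delay tolerant arrivals after service, \(U(t+1) = \max[U(t) - \gamma(t)(P(t) - R(t) + D(t)), 0] + W_1(t)\), and recovers \(\gamma(t)\) from the requirement that the delay intolerant workload \(W_2(t)\) is served in full. The function name `serve_slot` is hypothetical.

```python
def serve_slot(U, W1, W2, P, R, D):
    """Return (gamma, U_next) for one slot of the extended model (illustrative)."""
    supply = P - R + D                 # total power available to serve workload
    if supply < W2:
        raise ValueError("delay intolerant workload cannot be served in this slot")
    served_tolerant = supply - W2      # what is left after serving W2 in full
    gamma = served_tolerant / supply if supply > 0 else 0.0
    U_next = max(U - served_tolerant, 0.0) + W1
    return gamma, U_next


# Example: backlog 5, arrivals W1 = 2 and W2 = 3, grid draw 6 plus discharge 1.
gamma, U_next = serve_slot(U=5.0, W1=2.0, W2=3.0, P=6.0, R=0.0, D=1.0)
```

Here 4 of the 7 available units go to the delay tolerant backlog, so the backlog falls from 5 to 1 before the 2 new units arrive.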



The block diagram for this extended model is shown in Fig. 6.
Similar to the basic model, we assume that for \(i = 1, 2\), \(W_i(t)\)
varies randomly in an i.i.d. fashion, taking values from a set
\(\mathcal{W}_i\) of non-negative values. We assume that \(W_1(t) + W_2(t) \le W_{max}\)
for all \(t\). We also assume that \(W_1(t) \le W_{1,max} < W_{max}\)
and \(W_2(t) \le W_{2,max} < W_{max}\) for all \(t\). We further
assume that \(P_{peak} \ge W_{max} + \max[R_{max}, D_{max}]\). We use the
same model for battery and unit cost as in Sec. III.

Our objective is to minimize the time-average cost subject
to meeting all the constraints (such as finite battery size and
(29)) and ensuring finite average delay for the delay tolerant
workload. This can be stated as:



P4:

Minimize: \(\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\}\)

Subject to: Constraints 0,©,®,(|7|),(|9|),(|2^ 
Finite average delay for Wi {t) 

Similar to the basic model, we consider the following relaxed
problem:

P5:

Minimize: \(\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\}\)

Subject to: Constraints (2), (5), (9), (27), (29)

\[
\overline{R} = \overline{D} \tag{30}
\]
\[
\overline{U} < \infty \tag{31}
\]

where \(\overline{U}\) is the time average expected queue backlog for the
delay tolerant workload and is defined as:

\[
\overline{U} \triangleq \limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{U(\tau)\} \tag{32}
\]

Let \(\phi_{ext}\) and \(\hat{\phi}_{ext}\) denote the optimal value for problems P4
and P5 respectively. Since P5 is less constrained than P4, we
have that \(\hat{\phi}_{ext} \le \phi_{ext}\). Similar to Lemma 1, the following
holds:

Lemma 3: (Optimal Stationary, Randomized Policy): If the
workload process \(W_1(t), W_2(t)\) and auxiliary process \(S(t)\)
are i.i.d. over slots, then there exists a stationary, randomized
policy that takes control decisions \(\hat{P}(t), \hat{R}(t), \hat{D}(t), \hat{\gamma}(t)\) every
slot purely as a function (possibly randomized) of the current
state \((W_1(t), W_2(t), S(t))\) while satisfying the constraints
(2), (5), (9), (27), (29) and providing the following guarantees:

\[
\mathbb{E}\{\hat{R}(t)\} = \mathbb{E}\{\hat{D}(t)\} \tag{33}
\]
\[
\mathbb{E}\{\hat{\gamma}(t)(\hat{P}(t) - \hat{R}(t) + \hat{D}(t))\} \ge \mathbb{E}\{W_1(t)\} \tag{34}
\]
\[
\mathbb{E}\{\hat{P}(t)\hat{C}(t) + 1_{\hat{R}}(t)C_{rc} + 1_{\hat{D}}(t)C_{dc}\} = \hat{\phi}_{ext} \tag{35}
\]

where the expectations above are with respect to the stationary
distribution of \((W_1(t), W_2(t), S(t))\) and the randomized
control decisions.

Proof: This result follows from the framework in [8], [19]
and is omitted for brevity. ■

The condition (34) only guarantees queueing stability, not
bounded worst case delay. We will now design a dynamic
control algorithm that will yield bounded worst case delay
while guaranteeing average cost that is within \(O(1/V)\) of \(\hat{\phi}_{ext}\).

A. Delay-Aware Queue

In order to provide worst case delay guarantees to the delay
tolerant workload, we will make use of the technique of \(\epsilon\)-
persistent queue [19]. Specifically, we define a virtual queue
\(Z(t)\) as follows:

\[
Z(t+1) = [Z(t) - \gamma(t)(P(t) - R(t) + D(t)) + \epsilon 1_U(t)]^+ \tag{36}
\]

where \(\epsilon > 0\) is a parameter to be specified later, \(1_U(t)\) is an
indicator variable that is 1 if \(U(t) > 0\) and 0 else, and
\([x]^+ = \max[x, 0]\). The objective of this virtual queue is to enable
the provision of worst-case delay guarantee on any buffered
workload \(W_1(t)\). Specifically, if any control algorithm ensures
that \(U(t) \le U_{max}\) and \(Z(t) \le Z_{max}\) for all \(t\), then the worst
case delay can be bounded. This is shown in the following:

Lemma 4: (Worst Case Delay) Suppose a control algorithm
ensures that \(U(t) \le U_{max}\) and \(Z(t) \le Z_{max}\) for all \(t\), where
\(U_{max}\) and \(Z_{max}\) are some positive constants. Then the worst
case delay for any delay tolerant workload is at most \(\delta_{max}\)
slots where:

\[
\delta_{max} \triangleq \lceil (U_{max} + Z_{max})/\epsilon \rceil \tag{37}
\]



Proof: Consider a new arrival \(W_1(t)\) in any slot \(t\). We
will show that this is served on or before time \(t + \delta_{max}\). We
argue by contradiction. Suppose this workload is not served by
\(t + \delta_{max}\). Then for all slots \(\tau \in \{t+1, t+2, \ldots, t+\delta_{max}\}\), it
must be the case that \(U(\tau) > 0\) (else \(W_1(t)\) would have been
served before \(\tau\)). This implies that \(1_U(\tau) = 1\) and using (36),
we have:

\[
Z(\tau + 1) \ge Z(\tau) - \gamma(\tau)(P(\tau) - R(\tau) + D(\tau)) + \epsilon
\]

Summing for all \(\tau \in \{t+1, t+2, \ldots, t+\delta_{max}\}\), we get:

\[
Z(t + \delta_{max} + 1) - Z(t+1) \ge \delta_{max}\epsilon - \sum_{\tau=t+1}^{t+\delta_{max}} [\gamma(\tau)(P(\tau) - R(\tau) + D(\tau))]
\]

Using the fact that \(Z(t + \delta_{max} + 1) \le Z_{max}\) and \(Z(t+1) \ge 0\),
we get:

\[
\sum_{\tau=t+1}^{t+\delta_{max}} [\gamma(\tau)(P(\tau) - R(\tau) + D(\tau))] \ge \delta_{max}\epsilon - Z_{max} \tag{38}
\]

Note that by (28), \(W_1(t)\) is part of the backlog \(U(t+1)\). Since
\(U(t+1) \le U_{max}\) and since the service is FIFO, it will be
served on or before time \(t + \delta_{max}\) whenever at least \(U_{max}\) units
of power is used to serve the delay tolerant workload during
the interval \((t+1, \ldots, t+\delta_{max})\). Since we have assumed that
\(W_1(t)\) is not served by \(t + \delta_{max}\), it must be the case that
\(\sum_{\tau=t+1}^{t+\delta_{max}} [\gamma(\tau)(P(\tau) - R(\tau) + D(\tau))] < U_{max}\). Using this in
(38), we have:

\[
U_{max} > \delta_{max}\epsilon - Z_{max}
\]

This implies that \(\delta_{max} < (U_{max} + Z_{max})/\epsilon\), which contradicts
the definition of \(\delta_{max}\) in (37). ■
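The virtual queue and the resulting delay bound are simple to compute. The Python sketch below implements the update \(Z(t+1) = [Z(t) - \gamma(t)(P(t) - R(t) + D(t)) + \epsilon 1_U(t)]^+\) and a delay bound. It assumes that \(\delta_{max}\) is the smallest integer that is at least \((U_{max} + Z_{max})/\epsilon\), which is what the contradiction argument in the proof requires. The function names are hypothetical.

```python
import math


def update_z(Z, U, served_tolerant, eps):
    """One step of the epsilon-persistent virtual queue (illustrative).

    served_tolerant is gamma(t) * (P(t) - R(t) + D(t)), the power given to the
    delay tolerant workload in this slot.
    """
    indicator = 1.0 if U > 0 else 0.0   # Z grows by eps only while backlog is waiting
    return max(Z - served_tolerant + eps * indicator, 0.0)


def worst_case_delay(U_max, Z_max, eps):
    """Delay bound in slots, assuming delta_max = ceil((U_max + Z_max) / eps)."""
    return math.ceil((U_max + Z_max) / eps)


z_next = update_z(Z=2.0, U=1.0, served_tolerant=0.5, eps=0.25)
delay = worst_case_delay(U_max=10.0, Z_max=5.0, eps=2.0)
```

Because \(Z(t)\) keeps growing by \(\epsilon\) in every slot in which backlog is left waiting, any algorithm that keeps \(Z(t)\) bounded is forced to serve old work within a bounded number of slots.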






In Sec. VI-D, we will show that there are indeed constants
\(U_{max}\), \(Z_{max}\) such that the dynamic algorithm ensures that
\(U(t) \le U_{max}\), \(Z(t) \le Z_{max}\) for all \(t\).



B. Optimal Control Algorithm 

We now present an online control algorithm that approximately
solves P4. Similar to the algorithm for the basic model,
this algorithm also makes use of the following queueing state
variable \(X(t)\) to track the battery charge level and is defined
as follows:

\[
X(t) = Y(t) - Q_{max} - D_{max} - Y_{min} \tag{39}
\]

where \(Q_{max}\) is a constant to be specified in (44). Recall that
\(Y(t)\) denotes the actual battery charge level in slot \(t\) and
evolves according to the battery dynamics of Sec. III. It can be
seen that \(X(t)\) is simply a shifted version of \(Y(t)\) and its
dynamics is given by:

\[
X(t+1) = X(t) - D(t) + R(t) \tag{40}
\]

We will show that this definition enables our algorithm to
ensure that the finite battery constraint is met. We are now ready to
state the dynamic control algorithm. Let \((W_1(t), W_2(t), S(t))\)
be the system state in slot \(t\). Define \(Q(t) \triangleq (U(t), Z(t), X(t))\)
as the queue state that includes the workload queue as well as
auxiliary queues. Then the dynamic algorithm chooses control
decisions \(P(t), R(t), D(t)\) and \(\gamma(t)\) as the solution to the
following optimization problem:



P6:

Maximize: \([U(t) + Z(t)]P(t) - V[P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}] + [X(t) + U(t) + Z(t)](D(t) - R(t))\)

Subject to: Constraints (27), (29), (2), (5), (9)

where \(V > 0\) is a control parameter that affects the distance
from optimality. Let \(P^*(t)\), \(R^*(t)\), \(D^*(t)\) and \(\gamma^*(t)\) denote the
optimal solution to P6. Then, the dynamic algorithm allocates
\((1 - \gamma^*(t))(P^*(t) - R^*(t) + D^*(t))\) power to service the delay
intolerant workload and the remaining is used for the delay
tolerant workload.

After computing these quantities, the algorithm implements
them and updates the queueing variable \(X(t)\) according to
(40). This process is repeated every slot. Note that in solving
P6, the control algorithm only makes use of the current system
state values and does not require knowledge of the statistics
of the workload or unit cost processes.

We now establish two properties of the structure of the 
optimal solution to P6 that will be useful in analyzing its 
performance later. 

Lemma 5: The optimal solution to P6 has the following 
properties: 

1) If \(X(t) > -VC_{min}\), then the optimal solution always
chooses \(R^*(t) = 0\).

2) If \(X(t) < -Q_{max}\) (where \(Q_{max}\) is specified in (44)),
then the optimal solution always chooses \(D^*(t) = 0\).

Proof: See Appendix C. ■ 



C. Solving P6 

Similar to P3, the complexity of solving P6 depends on the
structure of the unit cost function \(C(t)\). For many cases of
practical interest, P6 is easy to solve and admits closed form
solutions that can be implemented in real time. We consider
one such case here.

1) \(C(t)\) does not depend on \(P(t)\): For notational convenience,
let \(Q_1(t) = [U(t) + Z(t) - VC(t)]\) and
\(Q_2(t) = [X(t) + U(t) + Z(t)]\).

Let \(\theta_1(t)\) denote the optimal value of the objective in P6
when there is no recharge or discharge. When \(C(t)\) does
not depend on \(P(t)\), this can be calculated as follows: If
\(U(t) + Z(t) \ge VC(t)\), then \(\theta_1(t) = Q_1(t)P_{peak}\). Else,
\(\theta_1(t) = Q_1(t)W_2(t)\).

Next, let \(\theta_2(t)\) denote the optimal value of the objective in
P6 when the option of recharge is chosen, so that \(R(t) > 0\),
\(D(t) = 0\). This can be calculated as follows:

1) If \(Q_1(t) \ge 0, Q_2(t) \ge 0\), then \(\theta_2(t) = Q_1(t)P_{peak} - VC_{rc}\).

2) If \(Q_1(t) \ge 0, Q_2(t) < 0\), then \(\theta_2(t) = Q_1(t)P_{peak} - Q_2(t)R_{max} - VC_{rc}\).

3) If \(Q_1(t) < 0, Q_2(t) \ge 0\), then \(\theta_2(t) = Q_1(t)W_2(t) - VC_{rc}\).

4) If \(Q_1(t) < 0, Q_2(t) < 0\), then we have two cases:

a) If \(Q_1(t) > Q_2(t)\), then \(\theta_2(t) = Q_1(t)(R_{max} + W_2(t)) - Q_2(t)R_{max} - VC_{rc}\).

b) If \(Q_1(t) \le Q_2(t)\), then \(\theta_2(t) = Q_1(t)W_2(t) - VC_{rc}\).

Finally, let \(\theta_3(t)\) denote the optimal value of the objective
in P6 when the option of discharge is chosen, so that
\(D(t) > 0\), \(R(t) = 0\). This can be calculated as follows:

1) If \(Q_1(t) \ge 0, Q_2(t) \ge 0\), then \(\theta_3(t) = Q_1(t)P_{peak} + Q_2(t)D_{max} - VC_{dc}\).

2) If \(Q_1(t) \ge 0, Q_2(t) < 0\), then \(\theta_3(t) = Q_1(t)P_{peak} - VC_{dc}\).

3) If \(Q_1(t) < 0, Q_2(t) \ge 0\), then
\(\theta_3(t) = Q_1(t)\max[0, W_2(t) - D_{max}] + Q_2(t)D_{max} - VC_{dc}\).

4) If \(Q_1(t) < 0, Q_2(t) < 0\), then we have two cases:

a) If \(Q_1(t) < Q_2(t)\), then
\(\theta_3(t) = Q_1(t)\max[0, W_2(t) - D_{max}] + Q_2(t)\min[W_2(t), D_{max}] - VC_{dc}\).

b) If \(Q_1(t) \ge Q_2(t)\), then \(\theta_3(t) = Q_1(t)W_2(t) - VC_{dc}\).

After computing \(\theta_1(t), \theta_2(t), \theta_3(t)\), we pick the mode that
yields the highest value of the objective and implement the
corresponding solution.
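The mode comparison can be sketched in Python as follows. The sketch writes the P6 objective as \(Q_1(t)P(t) + Q_2(t)(D(t) - R(t)) - V[1_R(t)C_{rc} + 1_D(t)C_{dc}]\) with \(Q_1(t) = U(t) + Z(t) - VC(t)\) and \(Q_2(t) = X(t) + U(t) + Z(t)\), and assumes the feasible set is \(0 \le P(t) \le P_{peak}\), \(0 \le R(t) \le R_{max}\), \(0 \le D(t) \le D_{max}\), \(P(t) - R(t) + D(t) \ge W_2(t)\), with \(P_{peak} \ge W_{max} + \max[R_{max}, D_{max}]\). The function name `solve_p6` and the mode labels are hypothetical.

```python
def solve_p6(U, Z, X, W2, C, V, R_max, D_max, P_peak, C_rc, C_dc):
    """Return (mode, value) for the best of the three modes (illustrative sketch)."""
    Q1 = U + Z - V * C   # weight on the grid draw P(t)
    Q2 = X + U + Z       # weight on the net discharge D(t) - R(t)

    # Mode 1: no recharge and no discharge.
    theta1 = Q1 * P_peak if Q1 >= 0 else Q1 * W2

    # Mode 2: recharge (R > 0, D = 0).
    if Q1 >= 0:
        theta2 = Q1 * P_peak - (Q2 * R_max if Q2 < 0 else 0.0)
    elif Q2 < 0 and Q1 > Q2:
        theta2 = Q1 * (R_max + W2) - Q2 * R_max
    else:
        theta2 = Q1 * W2
    theta2 -= V * C_rc

    # Mode 3: discharge (D > 0, R = 0).
    if Q1 >= 0:
        theta3 = Q1 * P_peak + (Q2 * D_max if Q2 >= 0 else 0.0)
    elif Q2 >= 0:
        theta3 = Q1 * max(0.0, W2 - D_max) + Q2 * D_max
    elif Q1 < Q2:
        theta3 = Q1 * max(0.0, W2 - D_max) + Q2 * min(W2, D_max)
    else:
        theta3 = Q1 * W2
    theta3 -= V * C_dc

    # Ties are resolved in favor of leaving the battery untouched.
    modes = [("none", theta1), ("recharge", theta2), ("discharge", theta3)]
    return max(modes, key=lambda m: m[1])
```

Since the objective is linear in the decisions within each mode, comparing three closed-form values is enough, which keeps the per-slot work constant.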

D. Performance Theorem 

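We define an upper bound \(V^{ext}_{max}\) on the maximum value
that \(V\) can take in our algorithm for the extended model.

\[
V^{ext}_{max} \triangleq \frac{Y_{max} - Y_{min} - (R_{max} + D_{max} + W_{1,max} + \epsilon)}{\chi_{min} - C_{min}} \tag{41}
\]

Then we have the following result.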

Fig. 7. One period of the unit cost process.

Fig. 8. One period of the workload process.

Theorem 2: (Algorithm Performance) Suppose \(U(0) = 0\),
\(Z(0) = 0\) and the initial battery charge level \(Y_{init}\) satisfies
\(Y_{min} \le Y_{init} \le Y_{max}\). Then implementing the algorithm
above with any fixed parameter \(\epsilon > 0\) such that
\(\epsilon \le W_{max} - W_{2,max}\) and a parameter \(V\) such that \(0 < V \le V^{ext}_{max}\) for
all \(t \in \{0, 1, 2, \ldots\}\) results in the following performance
guarantees:

1) The queues \(U(t)\) and \(Z(t)\) are deterministically upper
bounded by \(U_{max}\) and \(Z_{max}\) respectively for all \(t\) where:

\[
U_{max} \triangleq V\chi_{min} + W_{1,max} \tag{42}
\]
\[
Z_{max} \triangleq V\chi_{min} + \epsilon \tag{43}
\]


Further, the sum \(U(t) + Z(t)\) is also deterministically
upper bounded by \(Q_{max}\) where

\[
Q_{max} \triangleq V\chi_{min} + W_{1,max} + \epsilon \tag{44}
\]

2) The queue \(X(t)\) is deterministically upper and lower
bounded for all \(t\) as follows:

\[
-Q_{max} - D_{max} \le X(t) \le Y_{max} - Y_{min} - Q_{max} - D_{max} \tag{45}
\]

3) The actual battery level \(Y(t)\) satisfies \(Y_{min} \le Y(t) \le Y_{max}\) for all \(t\).

4) All control decisions are feasible.

5) The worst case delay experienced by any delay tolerant
request is given by:

\[
\left\lceil \frac{2V\chi_{min} + W_{1,max} + \epsilon}{\epsilon} \right\rceil \tag{46}
\]

6) If \(W_1(t), W_2(t)\) and \(S(t)\) are i.i.d. over slots, then the
time-average cost under the dynamic algorithm is within
\(B_{ext}/V\) of the optimal value:

\[
\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{P(\tau)C(\tau) + 1_R(\tau)C_{rc} + 1_D(\tau)C_{dc}\} \le \phi_{ext} + B_{ext}/V \tag{47}
\]

where \(B_{ext}\) is a constant given by
\(B_{ext} = (P_{peak} + D_{max})^2 + \frac{(W_{1,max})^2 + \epsilon^2}{2} + B\)
and \(\phi_{ext}\) is the optimal solution to P4 under any feasible control
algorithm (possibly with knowledge of future events).

Thus, by choosing larger \(V\), the time-average cost under the
dynamic algorithm can be pushed closer to the minimum
possible value \(\phi_{ext}\). However, this increases the worst case
delay bound yielding a \(O(1/V, V)\) utility-delay tradeoff. Also
note that \(V^{ext}_{max}\) limits how large \(V\) can be chosen.

Proof: See Appendix D. ■



VII. Simulation-based Evaluation 

We evaluate the performance of our control algorithms using
both synthetic and real pricing data. To gain insights into the
behavior of our algorithms and to compare with the optimal
offline solution, we first consider the basic model and use a
simple periodic unit cost and workload process as shown in
Figs. 7 and 8. These values repeat every 24 hours and the unit
cost does not depend on \(P(t)\). From Fig. 7, it can be seen
that \(C_{max}\) = $100 and \(C_{min}\) = $50. Further, we have that
\(\chi_{min} = C_{max} = 100\). We assume a slot size of 1 minute so
that the control decisions on \(P(t), R(t), D(t)\) are taken once
every minute. We fix the parameters \(R_{max} = 0.2\) MW-slot,
\(D_{max} = 1.0\) MW-slot, \(C_{rc} = C_{dc} = 0\), \(Y_{min} = 0\). We now
simulate the basic control algorithm of Sec. IV-A for different
values of \(Y_{max}\) and with \(V = V_{max}\). For each \(Y_{max}\), the
simulation is performed for a duration of 4 weeks.

Fig. 9. Average Cost per Hour vs. \(Y_{max}\). (Curves: Dynamic Control Algorithm,
Optimal Offline Cost, Minimum Cost, Cost with No Battery.)

In Fig. 9, we plot the average cost per hour under the
dynamic algorithm for different values of battery size \(Y_{max}\). It
can be seen that the average cost reduces as \(Y_{max}\) is increased
and converges to a fixed value for large \(Y_{max}\), as suggested
by Theorem 1. For this simple example, we can compute the
minimum possible average cost per hour (over all battery sizes)
and this is given by $33.23, which is also the value to which the
dynamic algorithm converges as \(Y_{max}\) is increased. Moreover,
in this example, we can also compute the optimal offline cost
for each value of \(Y_{max}\). These are also plotted in Fig. 9.
It can be seen that, for each \(Y_{max}\), the dynamic algorithm
performs quite close to the corresponding optimal value, even
for smaller values of \(Y_{max}\). Note that Theorem 1 provides such
guarantees only for sufficiently large values of \(Y_{max}\). Finally,
the average cost per hour when no battery is used is given by
$39.90.

We next consider a six-month data set of average hourly
spot market prices for the Los Angeles Zone LA1 obtained
from CAISO [6]. These prices correspond to the period
01/01/2005-06/30/2005 and each value denotes the average
price of 1 MW-Hour of electricity. A portion of this data
corresponding to the first week of January is plotted in Fig. 1.
We fix the slot size to 5 minutes so that control decisions on
\(P(t), R(t), D(t)\), etc. are taken once every 5 minutes. The unit
cost \(C(t)\) obtained from the data set for each hour is assumed
to be fixed for that hour. Furthermore, we assume that the unit
cost does not depend on the total power drawn \(P(t)\).

Fig. 10. Total Cost over 6 months with i.i.d. \(W(t)\) and different \(Y_{max}\).
(x-axis: Days. Curves: No Battery, No WP; Battery Size 15, No WP;
No Battery, WP with V=0.93; Battery Size 15, WP with V=0.93;
Battery Size 30, No WP; No Battery, WP with V=1.98;
Battery Size 30, WP with V=1.98; Battery Size 50;
No Battery, WP with V=3.39; Battery Size 50, WP with V=3.39.)

In our experiments, we assume that the data center receives
workload in an i.i.d. fashion. Specifically, every slot, \(W(t)\)
takes values from the set [0.1, 1.5] MW uniformly at random.
We fix the parameters \(D_{max}\) and \(R_{max}\) to 0.5 MW-slot,
\(C_{dc} = C_{rc}\) = $0.1, and \(Y_{min} = 0\). Also, \(P_{peak} = W_{max} + R_{max}\) =
2.0 MW. We now simulate four algorithms on this setup for
different values of \(Y_{max}\). The length of time the battery can
power the data center if the draw were \(W_{max}\) starting from
fully charged battery is given by \(Y_{max}/W_{max}\) slots, each of length
5 minutes. We consider the following four control techniques:
(A) "No Battery, No WP," which meets the demand in every
slot using power from the grid and without postponing any
workload, (B) "Battery, No WP," which employs the algorithm
in the basic model without postponing any workload, (C) "No
Battery, WP," which employs the extended model for WP
but without any battery, and (D) "Complete," the complete
algorithm of the extended model with both battery and WP.
For (C) and (D), we assume that during every slot, half of the
total workload is delay-tolerant.

We simulate these algorithms to obtain the total cost over
the 6 month period for \(Y_{max} \in \{15, 30, 50\}\) MW-slot. For (B),
we use \(V = V_{max}\) while for (C) and (D), we use \(V = V^{ext}_{max}\)
with e = Note that an increased battery capacity 

should have no effect on the performance under (C). In order 
to get a fair comparison with the other schemes, we assume 
that the worst case delay guarantee that case (C) must provide 
for the delay tolerant traffic is the same as that under (D). 

Fig. 10 plots the total cost under these schemes over the 6
month period. In Table II, we show the ratio of the total cost
under schemes (B), (C), (D) to the total cost under (A) for
these values of \(Y_{max}\) over the 6 month period. It can be seen
that (D) provides the most cost savings over the baseline case.

VIII. Conclusions and Future Work 

In this paper, we studied the problem of opportunistically 
using energy storage devices to reduce the time average elec- 
tricity bill of a data center. Using the technique of Lyapunov 
optimization, we designed an online control algorithm that 
achieves close to optimal cost as the battery size is increased. 



TABLE II
Ratio of cost under schemes (B), (C), (D) to the cost under (A)
for different values of \(Y_{max}\) with i.i.d. \(W(t)\).

\(Y_{max}\)           15     30     50
Battery, No WP    95%    92%    89%
WP, No Battery    96%    92%    88%
WP, Battery       92%    85%    79%



We would like to extend our current framework along 
several important directions including: (i) multiple utilities (or 
captive sources such as DG) with different price variations 
and availability properties (e.g., certain renewable sources of 
energy are not available at all times), (ii) tariffs where the 
utility bill depends on peak power draw in addition to the 
energy consumption, and (iii) devising online algorithms that 
offer solutions whose proximity to the optimal has a smaller 
dependence on battery capacity than currently. We also plan to 
explore implementation and feasibility related concerns such 
as: (i) what are appropriate trade-offs between investments in 
additional battery capacity and cost reductions that this offers? 
(ii) what is the extent of cost reduction benefits for realistic 
data center workloads? and (iii) does stored energy make sense 
as a cost optimization knob in other domains besides data 
centers? Our technique could be viewed as a design tool which, 
when parameterized well, can assist in determining suitable 
configuration parameters such as battery size, usage rules-of-thumb,
time-scale at which decisions should be made, etc.
Finally, we believe that our work opens up a whole set of 
interesting issues worth exploring in the area of consumer- 
end (not just data centers) demand response mechanisms for 
power cost optimization. 
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Appendix A -Proof of Lemma[2] 



We can rewrite the expression in the objective of P3 as
\([X(t) + VC(t)]P(t) + V[1_R(t)C_{rc} + 1_D(t)C_{dc}]\). To show part
1, suppose \(R^*(t) = \delta > 0\) when \(X(t) > -VC_{min}\), so that
we have \(P^*(t) - \delta = W(t)\), \(D^*(t) = 0\), \(1_R(t) = 1\), and
\(1_D(t) = 0\). Then the value of the objective is given by:

\[
\begin{aligned}
[X(t) + VC(P^*(t))]P^*(t) + VC_{rc} &= [X(t) + VC(W(t) + \delta)](W(t) + \delta) + VC_{rc} \\
&> [X(t) + VC(W(t))]W(t)
\end{aligned}
\]

where the last step follows by noting that \(X(t) + VC(W(t)) > 0\)
when \(X(t) > -VC_{min}\) and that \(C(t)\) is non-negative and
non-decreasing in \(P(t)\). The last expression denotes the value
of the objective when \(R^*(t) = D^*(t) = 0\) and all demand
is met using power drawn from the grid and is smaller. This
shows that when \(X(t) > -VC_{min}\), then the optimal solution
cannot choose \(R^*(t) > 0\).

Next, to show part 2, suppose \(D^*(t) = \delta > 0\) when
\(X(t) < -V\chi_{min}\), so that we have \(P^*(t) + \delta = W(t)\), \(R^*(t) = 0\), and
\(1_D(t) = 1\). Then the value of the objective is given by:

\[
\begin{aligned}
[X(t) + VC(P^*(t))]P^*(t) + VC_{dc} &= [X(t) + VC(W(t) - \delta)](W(t) - \delta) + VC_{dc} \\
&\ge [X(t) + VC(W(t) - \delta)](W(t) - \delta) \\
&> [X(t) + VC(W(t))]W(t)
\end{aligned}
\]

where in the last step, we used the property (10) together with
the fact that \(X(t) < -V\chi_{min}\). The last expression denotes
the value of the objective when \(R^*(t) = D^*(t) = 0\) and
all demand is met using power drawn from the grid and is
smaller. This shows that when \(X(t) < -V\chi_{min}\), then the
optimal solution cannot choose \(D^*(t) > 0\).



Appendix B - Proof of Bound (24)

Squaring both sides of the update equation \(X(t+1) = X(t) - D(t) + R(t)\),
dividing by 2, and rearranging yields:

\[
\frac{X^2(t+1) - X^2(t)}{2} = \frac{(D(t) - R(t))^2}{2} - X(t)[D(t) - R(t)]
\]

Now note that under any feasible algorithm, at most one
of \(R(t)\) and \(D(t)\) can be non-zero. Further, since \(R(t) \le R_{max}\),
\(D(t) \le D_{max}\) for all \(t\), we have:

\[
\frac{(D(t) - R(t))^2}{2} \le \frac{\max[R_{max}^2, D_{max}^2]}{2} = B
\]

Taking conditional expectations of both sides given \(X(t)\), we
have: \(\Delta(X(t)) \le B - X(t)\mathbb{E}\{D(t) - R(t) \mid X(t)\}\).
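The per-slot inequality behind this bound can be verified numerically. The Python sketch below computes the slack \(B - X(t)[D(t) - R(t)] - [L(X(t+1)) - L(X(t))]\) with \(L(X) = \frac{1}{2}X^2\) and \(B = \max[R_{max}^2, D_{max}^2]/2\), which should be non-negative whenever at most one of \(R(t)\), \(D(t)\) is non-zero. The function name `drift_slack` is hypothetical.

```python
def drift_slack(X, R, D, R_max, D_max):
    """Slack in the one-slot drift bound; non-negative for feasible (R, D)."""
    B = max(R_max ** 2, D_max ** 2) / 2.0
    X_next = X - D + R                              # update of the shifted battery level
    one_slot_change = 0.5 * X_next ** 2 - 0.5 * X ** 2
    return B - X * (D - R) - one_slot_change


# A full recharge with R_max = D_max = 10 meets the bound with equality.
slack = drift_slack(X=-50.0, R=10.0, D=0.0, R_max=10.0, D_max=10.0)
```

Algebraically the slack equals \(B - (D(t) - R(t))^2/2\), so it vanishes exactly when the larger of the two rate limits is used in full.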

IX. Proof of Lemma 5 

Suppose X(t) > -VCmin and P*(t) > 0,P>*(t) = 0. 
Then, we have that W2 (t) = (1 -7*(t))(P*(t) After 
rearranging, the value of the objective of P6 can be expressed 
as: 

[U{t) + Z{t)]{P%t) - R%t)) - VP\t)C{P\t)) - VCrc 
-X{t)R%t) < [U{t) + Z(t)](P*(t) - R%t)) 

- V[P%t) - R'{t)]C{P%t) - R%t)) 

where we used the inequalities P*{t)C{P*{t) - R*{t)) < 
P* (t)C(P* (t)) and X{t) + FC(P* (t) - i?* (t)) > 0. The first 
follows from the non-negative and non-decreasing property 
of C{t) in P{t). The second follows by noting that X{t) > 
-VCmin- Now note that the right hand side denotes the value 
of the objective when power P{t) = P*{t) — R*{t) is drawn 
from the grid and the battery is not recharged or discharged. 
This is a feasible option since by choosing j{t) = 0, we 
have that 1^2 (^) = {P*{t) - R*{t)). This shows that when 
X{t) > 0, R*{t) > is not optimal. This shows part 1. 

Next, suppose X{t) < -Qmax < -VXmin 

and D*{t) > 0, R*{t) = 0. Then, we have that 
W2{t) = (1 - 7*(t))(P*(t) + P>*(t)). We consider two 
cases: 

(1) P*(t) ^D*{t) < Ppeak' After rearranging, the value of 
the objective of P6 can be expressed as: 

[U{t) + Z{t)]{P%t) + D%t)) - VP%t)C{P%t)) - VCdc 
-^X{t)D%t) < [U{t) + Z{t)]{P%t) + D%t)) 

-VP%t)C{P%t)) -VXm^nD%t) 

where we used the fact that X{t) < —Vxmin and P)*(t) > 0. 
Using the property (ITqI) . we have: 

(P*(t) + D\t))C{P\t) + D\t)) - P\t)C{P\t)) 

<Xm^nD%t) 

Using this in the inequality above, we have: 

[U{t) + Z{t)]{P%t) + D%t)) - VP%t)C{P%t)) - VCdc 
-^X{t)D\t) < [U{t) + Z{t)]{P%t) + D%t)) 

- V{P%t) + D%t))C{P''{t) + D%t)) 
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Note that the last term denotes the value of objective when 
power P{t) = P''{t) + D''{t) < Ppeak is drawn from the 
grid and the battery is not recharged or discharged. This is 
a feasible option since by choosing ^{t) = 0, we have that 
W2{t) = (P*(t) + This shows that for this case, 

when X{t) < -Vxmin, D*{t) > is not optimal. 

(2) $P^*(t) + D^*(t) > P_{peak}$: The value of the objective of P6 is given by:

$$[U(t) + Z(t)] P^*(t) - V P^*(t) C(P^*(t)) - V C_{dc} + [U(t) + Z(t) + X(t)] D^*(t) < [U(t) + Z(t)] P^*(t) - V P^*(t) C(P^*(t))$$

where we used the fact that since $X(t) < -Q_{max}$ and $U(t) + Z(t) \le Q_{max}$ (Theorem 2 part 1), we have $[U(t) + Z(t) + X(t)] D^*(t) < 0$. The last term in the inequality above denotes the value of the objective when power $P(t) = P^*(t)$ is drawn from the grid and the battery is not recharged or discharged. To see that this is feasible, note that we need $(1 - \gamma(t)) P^*(t) = W_2(t)$ where $\gamma(t)$ must be $\le 1$. Since $W_2(t) \le W_{2,max}$, this implies:

$$1 - \gamma(t) = \frac{W_2(t)}{P^*(t)} \le \frac{W_{2,max}}{P^*(t)} < \frac{W_{2,max}}{P_{peak} - D_{max}}$$

where we used the fact that $P^*(t) > P_{peak} - D^*(t) \ge P_{peak} - D_{max}$. Now since $W_{2,max} \le P_{peak} - D_{max}$, the last term above is $\le 1$, so that choosing $P(t) = P^*(t)$ and $D(t) = 0$ is a feasible option. This shows that for this case as well, $D^*(t) > 0$ is not optimal.
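The feasibility step in case (2) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name and argument names are hypothetical, and it only solves $(1 - \gamma(t)) P^*(t) = W_2(t)$ for $\gamma(t)$ under the stated assumptions $W_2(t) \le W_{2,max} \le P_{peak} - D_{max}$ and $P_{peak} - D_{max} < P^*(t) \le P_{peak}$.

```python
def gamma_without_battery(w2, p_star, p_peak, d_max, w2_max):
    """Return gamma(t) solving (1 - gamma) * P*(t) = W2(t) for case (2).

    All names are illustrative; the preconditions mirror the assumptions
    used in the proof text.
    """
    if not (0.0 <= w2 <= w2_max <= p_peak - d_max):
        raise ValueError("need 0 <= W2(t) <= W2_max <= P_peak - D_max")
    if not (p_peak - d_max < p_star <= p_peak):
        raise ValueError("case (2) needs P_peak - D_max < P*(t) <= P_peak")
    served_fraction = w2 / p_star  # this is 1 - gamma(t)
    return 1.0 - served_fraction
```

Under these preconditions the served fraction never exceeds 1, which is the claim the proof relies on.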



Appendix D - Proof of Theorem 2, parts 1-6

Here, we prove parts 1-6 of Theorem 2.

Proof: (Theorem 2 part 1) We first show (42). Clearly, (42) holds for $t = 0$. Now suppose it holds for slot $t$. We will show that it also holds for slot $t + 1$. First suppose $U(t) \le V\chi_{min}$. Then, by (28), the most that $U(t)$ can increase in one slot is $W_{1,max}$, so that $U(t+1) \le V\chi_{min} + W_{1,max}$. Next, suppose $V\chi_{min} < U(t) \le V\chi_{min} + W_{1,max}$. Now consider the terms involving $P(t)$ in the objective of P6: $[U(t) + Z(t) - V C(P(t))] P(t)$. Since $U(t) > V\chi_{min}$, using property (10), we have:

$$[U(t) + Z(t) - V C(P(t))] P(t) \le [U(t) + Z(t) - V C(P_{peak})] P_{peak}$$

Thus, the optimal solution to problem P6 chooses $P^*(t) = P_{peak}$. Now, let $R^*(t)$, $D^*(t)$ and $\gamma^*(t)$ denote the other control decisions by the optimal solution to P6. Then the amount of power remaining for the data center (after recharging or discharging the battery) is $P_{peak} - R^*(t) + D^*(t)$. Out of this, a fraction $1 - \gamma^*(t)$ is used to serve the delay intolerant workload. Thus:

$$(1 - \gamma^*(t)) [P_{peak} - R^*(t) + D^*(t)] = W_2(t)$$

Using this, and the fact that $R^*(t) \le R_{max}$, we have:

$$\gamma^*(t) [P_{peak} - R^*(t) + D^*(t)] = P_{peak} - R^*(t) + D^*(t) - W_2(t) \ge P_{peak} - R_{max} - W_2(t) \ge W_1(t)$$

where we used the fact that $W_1(t) + W_2(t) \le W_{max} \le P_{peak} - R_{max}$. Thus, using (28), it can be seen that the amount of new arrivals to $U(t)$ cannot exceed the total service, and this yields $U(t+1) \le U(t)$.

(43) can be shown by similar arguments. Clearly, (43) holds for $t = 0$. Now suppose it holds for slot $t$. We will show that it also holds for slot $t + 1$. First suppose $Z(t) \le V\chi_{min}$. Then, by (36), the most that $Z(t)$ can increase in one slot is $\epsilon$, so that $Z(t+1) \le V\chi_{min} + \epsilon$. Next, suppose $V\chi_{min} < Z(t) \le V\chi_{min} + \epsilon$. Then, by a similar argument as before, the optimal solution to problem P6 chooses $P^*(t) = P_{peak}$. Now, let $R^*(t)$, $D^*(t)$ and $\gamma^*(t)$ denote the other control decisions by the optimal solution to P6. Then the amount of power remaining for the data center is $P_{peak} - R^*(t) + D^*(t)$. Out of this, a fraction $(1 - \gamma^*(t))$ is used for the delay intolerant workload. Thus:

$$(1 - \gamma^*(t)) [P_{peak} - R^*(t) + D^*(t)] = W_2(t)$$

Using this, and the fact that $R^*(t) \le R_{max}$, we have:

$$\gamma^*(t) [P_{peak} - R^*(t) + D^*(t)] \ge P_{peak} - R_{max} - W_2(t) \ge P_{peak} - R_{max} - W_{2,max} \ge W_{max} - W_{2,max} \ge \epsilon$$

where we used the fact that $W_2(t) \le W_{2,max}$ and $W_{max} \le P_{peak} - R_{max}$. Thus, using (36), it can be seen that the amount of new arrivals to $Z(t)$ cannot exceed the total service, and this yields $Z(t+1) \le Z(t)$.

(44) can be shown by similar arguments and the proof is omitted for brevity. ■
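The induction for (42) can be illustrated with a small simulation. This is only a sketch under stated assumptions: it takes the queue update to have the form $U(t+1) = \max[U(t) - \text{service}(t), 0] + W_1(t)$, models the regime $U(t) > V\chi_{min}$ by giving service at least equal to the arrivals (as the proof derives from $P^*(t) = P_{peak}$), and gives no service at all otherwise as a worst case. The function and parameter names are hypothetical.

```python
import random

def simulate_u_bound(V, chi_min, w1_max, slots=10_000, seed=0):
    """Simulate the delay-tolerant queue and return the largest U(t) seen."""
    rng = random.Random(seed)
    threshold = V * chi_min
    u = 0.0
    u_peak = 0.0
    for _ in range(slots):
        w1 = rng.uniform(0.0, w1_max)  # arrivals W1(t) in this slot
        if u > threshold:
            # Regime from the proof: service covers at least the arrivals.
            service = w1 + rng.uniform(0.0, w1_max)
        else:
            # The proof assumes nothing here, so take the worst case.
            service = 0.0
        u = max(u - service, 0.0) + w1
        u_peak = max(u_peak, u)
    return u_peak
```

In this model the peak backlog stays within $V\chi_{min} + W_{1,max}$: the queue can only grow while it is at or below the threshold, and then by at most $W_{1,max}$ per slot.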
Proof: (Theorem 2 part 2) We first show that (45) holds for $t = 0$. Using the definition of $X(t)$ from (39), we have that $Y(0) = X(0) + Q_{max} + D_{max} + Y_{min}$. Since $Y_{min} \le Y(0)$, we have:

$$Y_{min} \le X(0) + Q_{max} + D_{max} + Y_{min} \implies -Q_{max} - D_{max} \le X(0)$$

Next, we have that $Y(0) \le Y_{max}$, so that:

$$X(0) + Q_{max} + D_{max} + Y_{min} \le Y_{max} \implies X(0) \le Y_{max} - Y_{min} - Q_{max} - D_{max}$$

Combining these two shows that $-Q_{max} - D_{max} \le X(0) \le Y_{max} - Y_{min} - Q_{max} - D_{max}$.

Now suppose (45) holds for slot $t$. We will show that it also holds for slot $t + 1$. First, suppose $-V C_{min} < X(t) \le Y_{max} - Y_{min} - Q_{max} - D_{max}$. Then, from Lemma 5, we have that $R^*(t) = 0$. Thus, using (40), we have that $X(t+1) \le X(t) \le Y_{max} - Y_{min} - Q_{max} - D_{max}$. Next, suppose $X(t) \le -V C_{min}$. Then, the maximum possible increase is $R_{max}$, so that $X(t+1) \le -V C_{min} + R_{max}$. Now for all $V$ such that $0 \le V \le V_{ext}^{max}$, we have that:

$$-V C_{min} + R_{max} \le Y_{max} - Y_{min} - D_{max} - (\epsilon + W_{1,max}) - V\chi_{min} = Y_{max} - Y_{min} - Q_{max} - D_{max}$$

This follows from the definition (41) and the fact that $\chi_{min} \ge C_{min}$. Using this, we have $X(t+1) \le Y_{max} - Y_{min} - Q_{max} - D_{max}$ for this case as well. This establishes that $X(t+1) \le Y_{max} - Y_{min} - Q_{max} - D_{max}$.

Next, suppose $-Q_{max} - D_{max} \le X(t) < -Q_{max}$. Then, from Lemma 5, we have that $D^*(t) = 0$. Thus, using (40), we have that $X(t+1) \ge X(t) \ge -Q_{max} - D_{max}$. Next, suppose $-Q_{max} \le X(t)$. Then, the maximum possible decrease is $D_{max}$, so that $X(t+1) \ge -Q_{max} - D_{max}$ for this case as well. This shows that $X(t+1) \ge -Q_{max} - D_{max}$. Combining these two bounds proves (45). ■
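The two-sided induction for (45) can be sanity-checked with a simulation of $X(t+1) = X(t) - D(t) + R(t)$. This is a sketch under assumptions: it enforces only the two structural rules used in the proof ($R(t) = 0$ when $X(t) > -V C_{min}$, $D(t) = 0$ when $X(t) < -Q_{max}$) and otherwise picks recharge or discharge amounts at random. The function name, the argument `x_upper` (standing for $Y_{max} - Y_{min} - Q_{max} - D_{max}$) and the random policy are all hypothetical.

```python
import random

def simulate_x_bounds(V, c_min, q_max, r_max, d_max, x_upper, slots=10_000, seed=1):
    """Run the X(t) recursion under the two structural rules; return (min, max) seen."""
    x_lower = -q_max - d_max
    if -V * c_min + r_max > x_upper:
        # Mirrors the condition on V needed for the upper-bound induction step.
        raise ValueError("V is too large for the induction step to go through")
    rng = random.Random(seed)
    x = rng.uniform(x_lower, x_upper)  # any start inside the claimed interval
    seen_min = seen_max = x
    for _ in range(slots):
        may_recharge = x <= -V * c_min  # rule 1: no recharge when X(t) > -V*C_min
        may_discharge = x >= -q_max     # rule 2: no discharge when X(t) < -Q_max
        r = d = 0.0
        # Never recharge and discharge in the same slot; if both are allowed, pick one.
        if may_recharge and (not may_discharge or rng.random() < 0.5):
            r = rng.uniform(0.0, r_max)
        elif may_discharge:
            d = rng.uniform(0.0, d_max)
        x = x - d + r
        seen_min = min(seen_min, x)
        seen_max = max(seen_max, x)
    return seen_min, seen_max
```

Any policy that respects the two rules keeps $X(t)$ inside the interval; the random choice here is just one convenient way to exercise both boundaries.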

Proof: (Theorem 2 parts 3 and 4) Part 3 directly follows from (45) and (39). Using $Y(t) = X(t) + Q_{max} + D_{max} + Y_{min}$ in the lower bound in (45), we have:

$$-Q_{max} - D_{max} \le Y(t) - Q_{max} - D_{max} - Y_{min} \implies Y_{min} \le Y(t)$$

Similarly, using $Y(t) = X(t) + Q_{max} + D_{max} + Y_{min}$ in the upper bound in (45), we have:

$$Y(t) - Q_{max} - D_{max} - Y_{min} \le Y_{max} - Y_{min} - Q_{max} - D_{max} \implies Y(t) \le Y_{max}$$

Part 4 now follows from part 3 and the constraint on $P(t)$ in P6. ■

Proof: (Theorem 2 part 5) This follows from part 1 and Lemma 4. ■

Proof: (Theorem 2 part 6) We use the following Lyapunov function: $L(Q(t)) \triangleq \frac{1}{2}(U^2(t) + Z^2(t) + X^2(t))$. Define the conditional 1-slot Lyapunov drift as follows:

$$\Delta(Q(t)) \triangleq \mathbb{E}\{L(Q(t+1)) - L(Q(t)) \mid Q(t)\} \quad (48)$$

Using (28), (36), (40), the drift + penalty term can be bounded as follows:

$$\Delta(Q(t)) + V \mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid Q(t)\} \le B_{ext} - [U(t) + Z(t)] \mathbb{E}\{P(t) \mid Q(t)\} - W_2(t)[U(t) + Z(t)] - [X(t) + U(t) + Z(t)] \mathbb{E}\{D(t) - R(t) \mid Q(t)\} + V \mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid Q(t)\} \quad (49)$$

where $B_{ext}$ is a finite constant. Comparing this with P6, it can be seen that given any queue value $Q(t)$, our control algorithm is designed to minimize the right hand side of (49) over all possible feasible control policies. This includes the optimal, stationary, randomized policy given in Lemma 3. Using the same argument as before, we have the following, where hatted quantities denote the decisions and unit cost under that stationary, randomized policy:

$$\Delta(Q(t)) + V \mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc} \mid Q(t)\} \le B_{ext} + V \mathbb{E}\{\hat{P}(t)\hat{C}(t) + \hat{1}_R(t)C_{rc} + \hat{1}_D(t)C_{dc} \mid Q(t)\} = B_{ext} + V\phi_{ext}$$

Taking the expectation of both sides, using the law of iterated expectations, and summing over $t \in \{0, 1, 2, \ldots, T-1\}$, we get:

$$\sum_{t=0}^{T-1} V \mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}\} \le B_{ext} T + V T \phi_{ext} - \mathbb{E}\{L(Q(T))\} + \mathbb{E}\{L(Q(0))\}$$

Dividing both sides by $VT$ and taking the limit as $T \to \infty$ yields:

$$\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}\{P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}\} \le \phi_{ext} + B_{ext}/V$$

where we used the fact that $\mathbb{E}\{L(Q(0))\}$ is finite and that $\mathbb{E}\{L(Q(T))\}$ is non-negative. ■
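The summation step rests on the drift terms telescoping. A short worked version is sketched here, where $\mathrm{cost}(t)$ is shorthand introduced only for this illustration for $P(t)C(t) + 1_R(t)C_{rc} + 1_D(t)C_{dc}$:

```latex
\begin{align*}
\sum_{t=0}^{T-1} \mathbb{E}\{\Delta(Q(t))\}
  &= \sum_{t=0}^{T-1} \mathbb{E}\{L(Q(t+1)) - L(Q(t))\}
   = \mathbb{E}\{L(Q(T))\} - \mathbb{E}\{L(Q(0))\}, \\
\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}\{\mathrm{cost}(t)\}
  &\le \phi_{ext} + \frac{B_{ext}}{V}
     + \frac{\mathbb{E}\{L(Q(0))\} - \mathbb{E}\{L(Q(T))\}}{VT}
   \le \phi_{ext} + \frac{B_{ext}}{V} + \frac{\mathbb{E}\{L(Q(0))\}}{VT}.
\end{align*}
```

The last term vanishes as $T \to \infty$ because $\mathbb{E}\{L(Q(0))\}$ is finite, which leaves the $\phi_{ext} + B_{ext}/V$ bound.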

Appendix E - Full Solution for Convex, Increasing $C(t)$

Let $C'(S, P)$ be the derivative of $C(S, P)$ with respect to $P$. Also, let $P'$ denote the solution to the equation $C'(S, P) = 0$ and $C(P') = C(S, P')$. Then, the optimal solution can be obtained as follows:

1) If $P_{low} \le P' \le W(t)$, then $R^*(t) = 0$ and we have the following two cases:

a) If $P'(X(t) + V C(P')) + V C_{dc} < \theta(t)$, then $P^*(t) = P'$, $D^*(t) = W(t) - P'$.

b) Else, draw all power from the grid. This yields $D^*(t) = 0$ and $P^*(t) = W(t)$.

2) If $W(t) < P' \le P_{high}$, then $D^*(t) = 0$ and we have the following two cases:

a) If $P'(X(t) + V C(P')) + V C_{rc} < \theta(t)$, then $P^*(t) = P'$, $R^*(t) = P' - W(t)$.

b) Else, draw all power from the grid. This yields $R^*(t) = 0$ and $P^*(t) = W(t)$.

3) If $P_{high} < P'$, then $D^*(t) = 0$ and we have the following two cases:

a) If $P_{high}(X(t) + V C(P_{high})) + V C_{rc} < \theta(t)$, then $P^*(t) = P_{high}$, $R^*(t) = P_{high} - W(t)$.

b) Else, draw all power from the grid. This yields $R^*(t) = 0$ and $P^*(t) = W(t)$.

4) If $P' < P_{low}$, then $R^*(t) = 0$ and we have the following two cases:

a) If $P_{low}(X(t) + V C(P_{low})) + V C_{dc} < \theta(t)$, then $P^*(t) = P_{low}$, $D^*(t) = W(t) - P_{low}$.

b) Else, draw all power from the grid. This yields $D^*(t) = 0$ and $P^*(t) = W(t)$.
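The case analysis above can be sketched as a single per-slot routine. This is an illustration under assumptions, not the authors' implementation: the function name and signature are hypothetical, $P'$ is passed in rather than computed, the unit cost is supplied as a callable `C`, and $\theta(t)$ is assumed to be the objective value $W(t)(X(t) + V C(W(t)))$ obtained when the battery is left untouched. The sketch clamps $P'$ to $[P_{low}, P_{high}]$, then discharges if the clamped value lies below $W(t)$ and recharges if it lies above, provided the fixed cost is worth paying.

```python
def solve_slot(W, X, V, C, P_prime, P_low, P_high, C_rc, C_dc):
    """Return (P, R, D) for one slot from the four-case structure.

    C is a callable giving the unit cost C(P) for the current state
    (hypothetical interface); P_prime is the candidate P' computed elsewhere.
    """
    def objective(P):
        return P * (X + V * C(P))

    theta = objective(W)  # assumed: objective with no recharge or discharge
    P_cand = min(max(P_prime, P_low), P_high)  # clamp P' to the allowed range
    if P_cand < W:
        # Drawing less than the demand: the gap is covered by discharging.
        if objective(P_cand) + V * C_dc < theta:
            return P_cand, 0.0, W - P_cand
    elif P_cand > W:
        # Drawing more than the demand: the surplus recharges the battery.
        if objective(P_cand) + V * C_rc < theta:
            return P_cand, P_cand - W, 0.0
    # Otherwise draw exactly the demand from the grid.
    return W, 0.0, 0.0
```

For example, with `C = lambda P: 1.0 + 0.1 * P`, demand `W = 5.0`, and a strongly negative `X`, a candidate above `P_high` is clamped to `P_high` and the surplus is used to recharge.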



